NUCLEOTIDE AND AMINO ACID SEQUENCES OF HYPERVARIABLE REGION 1 OF THE
ENVELOPE 2 GENE OF HEPATITIS C VIRUS



Field Of Invention 
The present invention is in the field of hepatitis virology. The
invention relates to the nucleotide and deduced amino acid sequences
of hypervariable region 1 of the envelope 2 (E2) gene of hepatitis C
virus (HCV) isolates from around the world and the grouping of these
hypervariable sequences into distinct HCV genotypes. More
specifically, this invention relates to diagnostic methods and
vaccines which employ nucleic acid sequences and recombinant or
synthetic proteins derived from these hypervariable sequences.

Background Of Invention 
Hepatitis C, originally called non-A, non-B hepatitis, was first
described in 1975 as a disease serologically distinct from hepatitis A
and hepatitis B (Feinstone, S.M. et al. (1975) N. Engl. J. Med.,
292:767-770). Although hepatitis C was (and is) the leading type of
transfusion-associated hepatitis as well as an important part of
community-acquired hepatitis, little progress was made in
understanding the disease until the recent identification of
hepatitis C virus (HCV) as the causative agent of hepatitis C via the
cloning and sequencing of the HCV genome (Choo, Q.L. et al. (1989)
Science, 244:359-362). The sequence information generated by this
study resulted in the characterization of HCV as a small, enveloped,
positive-stranded RNA virus and led to the demonstration that HCV is a
major cause of both acute and chronic hepatitis worldwide (Weiner,
A.J. et al. (1990) Lancet, 335:1-3). Subsequently, it has been
observed that approximately 80% of individuals acutely infected with
HCV become chronically infected and more than 20% of these
individuals eventually develop liver cirrhosis (Alter, H.J., Seeff,
L.B.: Transfusion Associated Hepatitis, In: Zuckerman, A.J., Thomas,
H.C. (eds): Viral Hepatitis: Scientific Basis and Clinical
Management. Edinburgh, Churchill Livingstone, 1993). In addition, a
strong association has been found between HCV infection and the
development of hepatocellular carcinoma (Bukh et al. (1993) Proc.
Natl. Acad. Sci. USA, 90:1848-1851) and HCV infection also seems to
be associated with other diseases, including some autoimmune diseases
(Manns, M.P. (1993) Intervirol., 35:108-115; Lionel, F. (1994)
Gastroenterology, 107:1550-1555). Thus, significant morbidity and
mortality are caused by HCV infection worldwide and vaccine
development is a high priority.

Choo et al. ((1994) Proc. Natl. Acad. Sci. USA, 91:1294-1298), using
recombinant E1 and E2 proteins of HCV-1 as immunogens, reported the
successful vaccination of chimpanzees against challenge with 10 CID50
of the homologous strain of HCV. However, Choo et al. did not
demonstrate protection against challenge with a heterologous strain
of HCV and the recent discovery of the extraordinary diversity of HCV
genomes based on sequence analysis of numerous HCV isolates (Bukh et
al. (1993) Proc. Natl. Acad. Sci. USA, 90:8234-8238; Bukh et al.
(1994) Proc. Natl. Acad. Sci. USA, 91:8239-8243) suggests that a
successful vaccine must protect against challenge by multiple strains
of HCV. In addition, both Farci et al. (Farci, P. et al. (1992)
Science, 258:135-140) and Prince et al. (Prince, A.M. et al. (1992)
J. Infect. Dis., 165:438-443) have presented evidence that while
infection with one strain of HCV does modify the degree of the
hepatitis C associated with the reinfection, it does not protect
against reinfection with a closely related strain.

One possible candidate for use as an immunogen in a vaccine
protective against multiple strains of HCV is a short region within
the E2 gene termed hypervariable region 1 (HVR1) that has many
similarities to the V3 loop of HIV, which represents the principal
neutralizing domain of HIV (Letvin, N.L. (1993) N. Engl. J. Med.,
329:1400). Indeed, the recent demonstration that antibodies specific
to HVR1 can neutralize HCV in an in vitro binding assay (Zibert, A.
et al. (1995) Virology, 208:653-661) suggests that HVR1 may be a
principal neutralization determinant of HCV. Thus, the identification
of HVR1 sequences from multiple HCV isolates of different genotypes
may be useful in developing an immunogen capable of stimulating a
protective immune response against challenge by infection with HCV
isolates.

Summary of Invention 
The present invention relates to the nucleotide and deduced amino
acid sequences of hypervariable region 1 (HVR1) of the envelope 2
(E2) gene of 49 human hepatitis C virus (HCV) isolates.

The invention also relates to proteins derived 
from the hypervariable sequences disclosed herein. These 
proteins may be synthesized chemically or may be produced 
recombinantly by inserting hypervariable nucleic acid 
sequences into an expression vector and expressing the 
recombinant protein in a host cell. 

The invention further relates to the use of 
these proteins, either alone, or in combination with each 
other, as diagnostic agents and as vaccines. 

The invention further relates to the use of 
expression vectors containing the hypervariable nucleic 
acid sequences of the present invention as nucleic acid 
based vaccines. 

This invention therefore relates to 
pharmaceutical compositions useful in prevention or 
treatment of hepatitis C in a mammal. 

The invention also relates to the use of single-stranded antisense
poly- or oligonucleotides derived from HVR1 nucleic acid sequences to
inhibit expression of hepatitis C E2 genes.

The invention further relates to multiple computer-generated
alignments of the nucleotide and deduced amino acid sequences of the
HVR1 sequences. These multiple sequence alignments produce consensus
sequences which serve to highlight regions of homology and
non-homology between sequences found within the same genotype or in
different genotypes and hence, these alignments can be used by those
of ordinary skill in the art to design proteins and nucleic acid
sequences useful as reagents in diagnostic assays and vaccines.

The present invention also encompasses methods 
of detecting antibodies specific for hepatitis C virus in 
biological samples. The methods of detecting HCV or 
antibodies to HCV disclosed in the present invention are 
useful for diagnosis of infection and disease caused by 
HCV and for monitoring the progression of such disease. 
Such methods are also useful for monitoring the efficacy 
of therapeutic agents during the course of treatment of 
HCV infection and disease in a mammal. 

The invention also provides a kit for the 
detection of antibodies specific for HCV in a biological 
sample where said kit contains at least one purified and 
isolated protein derived from the hypervariable sequences. 

The invention also relates to methods for 
detecting the presence of hepatitis C virus in a mammal, 
said methods comprising analyzing the RNA of a mammal for 
the presence of hepatitis C virus. These methods can be 
used to identify specific isolates of hepatitis C virus 
present in a mammal which is useful in determining the 
proper course of treatment for an HCV-infected patient.

The invention also provides a diagnostic kit for the detection of
hepatitis C virus in a biological sample. The kit comprises purified
and isolated nucleic acid sequences useful as primers for
reverse-transcription polymerase chain reaction (RT-PCR) analysis of
RNA for the presence of hepatitis C virus genomic RNA.

The invention also relates to antibodies to the HVR1 proteins of the
present invention and the use of such antibodies in passive
immunoprophylaxis.

Description of Figures 
Figures 1A-K show computer-generated sequence alignments of the
nucleotide sequences of the HVR1 region of the E2 gene of 49 HCV
isolates. The single letter abbreviations used for the nucleotides
shown in Figures 1A-K are those standardly used in the art. Figure 1A
shows the alignment of SEQ ID NOs:1-8 to produce a consensus sequence
for subtype I/1a. Figure 1B shows the alignment of SEQ ID NOs:9-25 to
produce a consensus sequence for subtype II/1b. Figure 1C shows the
alignment of SEQ ID NOs:1-25 to produce a consensus for genotype 1
where genotype 1 comprises subtypes 1a (SEQ ID NOs:1-8) and 1b (SEQ
ID NOs:9-25). Figure 1D shows the alignment of SEQ ID NOs:26-29 to
produce a consensus sequence for subtype III/2a. Figure 1E shows the
alignment of SEQ ID NOs:30-32 to produce a consensus sequence for
subtype IV/2b. Figure 1F shows the alignment of SEQ ID NOs:26-33 to
produce a consensus sequence for genotype 2 where genotype 2
comprises subtypes 2a (SEQ ID NOs:26-29), 2b (SEQ ID NOs:30-32) and
2c (SEQ ID NO:33). Figure 1G shows the alignment of SEQ ID NOs:34-38
to produce a consensus sequence for genotype V/3a. Figure 1H shows
the computer alignment of SEQ ID NOs:41-42 to produce a consensus
sequence for subtype 4c. Figure 1I shows the alignment of SEQ ID
NOs:39-43 to produce a consensus sequence for genotype 4 where
genotype 4 comprises subtypes 4a (SEQ ID NO:39), 4b (SEQ ID NO:40),
4c (SEQ ID NOs:41-42) and 4d (SEQ ID NO:43). Figure 1J shows the
alignment of SEQ ID NOs:44-48 to produce a consensus sequence for
genotype 5a. Figure 1K shows the alignment of the HVR1 sequences of
the 49 HCV isolates (SEQ ID NOs:1-49) to produce a consensus sequence
for all genotypes. The nucleotides shown in capital letters in the
consensus sequences of Figures 1A-1K are those conserved within a
genotype (Figures 1A-J) or among all isolates (Figure 1K) while
nucleotides shown in lower case letters in the consensus sequences
are those variable within a genotype (Figures 1A-J) or among all
isolates (Figure 1K). In addition, when the lower case letter is
shown in a consensus sequence, the lower case letter represents the
nucleotide found most frequently in the sequences aligned to produce
the consensus sequence. Finally, a hyphen at a nucleotide position in
the consensus sequences in Figures 1A-K indicates that two
nucleotides were found in equal numbers at that position in the
aligned sequences. In the aligned sequences, nucleotides are shown in
lower case letters if they differed from the nucleotides of both
adjacent isolates.
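
By way of illustration only, the consensus convention described above
(capital letter for a conserved position, lower case letter for the
most frequent residue at a variable position, hyphen where two
residues are found in equal numbers) can be sketched as follows. This
sketch is not the alignment program used to generate the figures; the
function name `consensus` is hypothetical, the input is assumed to be
already aligned and of equal length, and a hyphen is assumed to apply
whenever the two most frequent residues are tied.

```python
from collections import Counter


def consensus(aligned):
    """Return a consensus string for equal-length, pre-aligned sequences.

    Upper case: every sequence has the same residue at that position.
    Lower case: the position varies; the most frequent residue is shown.
    Hyphen:     the two most frequent residues occur equally often
                (an assumption about how ties are reported).
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")

    out = []
    # zip(*...) walks the alignment column by column.
    for column in zip(*(s.upper() for s in aligned)):
        counts = Counter(column).most_common()
        if len(counts) == 1:
            out.append(counts[0][0])            # conserved position
        elif counts[0][1] == counts[1][1]:
            out.append("-")                     # tie between top residues
        else:
            out.append(counts[0][0].lower())    # variable position
    return "".join(out)


# Toy input, not taken from the sequence listing.
print(consensus(["ACGT", "ACGA", "ACTA"]))
```

The same function applies unchanged to aligned amino acid sequences,
since it treats each column as a set of single-letter symbols.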

Figures 2A-K show computer alignments of the deduced amino acid
sequences of the HVR1 region of the envelope 2 gene of 49 isolates of
HCV. The single letter abbreviations used for the amino acids shown
in Figures 2A-K follow the conventional amino acid shorthand for the
twenty naturally occurring amino acids. Figure 2A shows the alignment
of SEQ ID NOs:50-57 to produce a consensus sequence for subtype I/1a.
Figure 2B shows the alignment of SEQ ID NOs:58-74 to produce a
consensus sequence for subtype II/1b. Figure 2C shows the alignment
of SEQ ID NOs:50-74 to produce a consensus sequence for genotype 1
where genotype 1 comprises subtypes 1a (SEQ ID NOs:50-57) and 1b (SEQ
ID NOs:58-74). Figure 2D shows the alignment of SEQ ID NOs:75-78 to
produce a consensus sequence for subtype III/2a. Figure 2E shows the
alignment of SEQ ID NOs:79-81 to produce a consensus sequence for
subtype IV/2b. Figure 2F shows the alignment of SEQ ID NOs:75-82 to
produce a consensus sequence for genotype 2 where genotype 2
comprises subtypes 2a (SEQ ID NOs:75-78), 2b (SEQ ID NOs:79-81) and
2c (SEQ ID NO:82). Figure 2G shows the alignment of SEQ ID NOs:83-87
to produce a consensus sequence for genotype V/3a. Figure 2H shows
the computer alignment of SEQ ID NOs:90-91 to produce a consensus
sequence for subtype 4c. Figure 2I shows the alignment of SEQ ID
NOs:88-92 to produce a consensus sequence for genotype 4 where
genotype 4 comprises subtypes 4a (SEQ ID NO:88), 4b (SEQ ID NO:89),
4c (SEQ ID NOs:90-91) and 4d (SEQ ID NO:92). Figure 2J shows the
alignment of SEQ ID NOs:93-97 to produce a consensus sequence for
genotype 5a. Figure 2K shows the alignment of the HVR1 amino acid
sequences of the 49 HCV isolates (SEQ ID NOs:50-98) to produce a
consensus sequence for all genotypes. The amino acids shown in
capital letters in the consensus sequences of Figures 2A-K are those
conserved within a genotype (Figures 2A-J) or among all isolates
(Figure 2K) while amino acids shown in lower case letters in the
consensus sequences are those variable within a genotype (Figures
2A-J) or among all isolates (Figure 2K). In addition, when the lower
case letter is shown in a consensus sequence, the letter represents
the amino acid found most frequently in the sequences aligned to
produce the consensus sequence. Finally, a hyphen at an amino acid
position in the consensus sequences of Figures 2A-K indicates that
two amino acids were found in equal numbers at that position in the
aligned sequences. In the aligned sequences, amino acids are shown in
lower case letters if they differed from the amino acids of both
adjacent isolates.

Detailed Description Of Invention 
The present invention relates to nucleotide and deduced amino acid
sequences of hypervariable region 1 (HVR1) of the E2 gene of 49
isolates of human hepatitis C virus (HCV) where HVR1 is defined as
starting at amino acid 384 of the HCV polyprotein (Bukh, J. et al.
(1995) Seminars in Liver Disease, 15:41-63; Hijikata, M. et al.
(1991) Biochem. Biophys. Res. Comm., 175:220-228; and Hijikata, M. et
al. (1991) Proc. Natl. Acad. Sci. U.S.A., 88:5547-5551). The nucleic
acid sequences of the present invention were obtained as follows.
Viral RNA was extracted from serum collected from humans infected
with hepatitis C virus and the viral RNA was then reverse transcribed
and amplified by polymerase chain reaction using primers deduced from
the sequence of the HCV strain H-77 (Bukh, et al. (1993) Proc. Natl.
Acad. Sci. U.S.A., 90:8234-8238). The amplified cDNA was then
isolated by gel electrophoresis and sequenced.

WO 96/40764 PCT/US96/09340

The HVR1 nucleotide sequences of the 49 HCV isolates are shown in the
sequence listing as SEQ ID NO:1 through SEQ ID NO:49.

The abbreviations used for the nucleotides are 
those standardly used in the art. 

The deduced amino acid sequences of each of SEQ ID NO:1 through SEQ
ID NO:49 are presented in the sequence listing as SEQ ID NO:50
through SEQ ID NO:98 where the amino acid sequence in SEQ ID NO:50 is
deduced from the nucleotide sequence shown in SEQ ID NO:1, the amino
acid sequence shown in SEQ ID NO:51 is deduced from the nucleotide
sequence shown in SEQ ID NO:2 and so on. The deduced amino acid
sequence of each of SEQ ID NOs:50-98 starts at nucleotide 1 of the
corresponding nucleic acid sequence shown in SEQ ID NOs:1-49.
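
As a non-limiting sketch of the relationship just described, the
following shows how an amino acid sequence may be deduced from a
nucleotide sequence reading in frame from nucleotide 1, and how the
SEQ ID NO of a nucleotide sequence maps to that of its deduced amino
acid sequence (SEQ ID NO:1 to SEQ ID NO:50, SEQ ID NO:2 to SEQ ID
NO:51, and so on). The function names are hypothetical, the standard
genetic code is assumed, and single-letter amino acid codes are
emitted for brevity rather than the three-letter codes of the
sequence listing.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nucleotides):
    """Translate in frame from nucleotide 1; '*' marks a stop codon.

    Trailing nucleotides that do not fill a codon are ignored, and any
    codon containing an unrecognized symbol is reported as 'X'.
    """
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


def amino_acid_seq_id(nucleotide_seq_id):
    """Map SEQ ID NO:1-49 to the corresponding SEQ ID NO:50-98."""
    if not 1 <= nucleotide_seq_id <= 49:
        raise ValueError("nucleotide SEQ ID NO must be between 1 and 49")
    return nucleotide_seq_id + 49


# Toy input, not taken from the sequence listing.
print(translate("ATGGAAACC"), amino_acid_seq_id(1))
```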

The three letter abbreviations used in SEQ ID NOs:50-98 follow the
conventional amino acid shorthand for the twenty naturally occurring
amino acids.

Preferably, the HVR1 proteins of the present invention are
substantially homologous to, and most preferably biologically
equivalent to, native HCV HVR1 proteins. For purposes of the present
invention, protein as used herein refers to a molecule containing a
complete amino acid sequence shown in SEQ ID NOs:50-98 or a fragment
of these sequences of at least about 6 to about 8 amino acids in
length. By "biologically equivalent" as used throughout the
specification and claims, it is meant that the compositions are
immunogenically equivalent to the native HVR1 proteins. The HVR1
proteins of the present invention may also stimulate the production
of protective antibodies upon injection into a mammal that would
serve to protect the mammal upon challenge with HCV. By
"substantially homologous" as used throughout the ensuing
specification and claims to describe HVR1 proteins, it is meant a
degree of homology in the amino acid sequence of the HVR1 proteins to
the native HVR1 amino acid sequences disclosed herein. Preferably the
degree of homology is in excess of 80%, preferably in excess of 90%,
with a particularly preferred group of proteins being in excess of
95% homologous with the native HVR1 amino acid sequences.
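
For illustration only, the degree of homology between a candidate
protein and a native sequence might be estimated as the percentage of
identical positions between two aligned sequences of equal length.
This is an assumption: the specification does not fix a particular
method of calculating homology, and the function names below are
hypothetical.

```python
def percent_identity(candidate, native):
    """Percentage of positions at which two aligned sequences agree."""
    if not candidate or len(candidate) != len(native):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(candidate.upper(), native.upper()))
    return 100.0 * matches / len(native)


def homology_tier(candidate, native):
    """Return the highest preferred threshold exceeded, or None."""
    pct = percent_identity(candidate, native)
    for threshold in (95, 90, 80):
        if pct > threshold:
            return f"in excess of {threshold}%"
    return None


# Toy peptides, not taken from the sequence listing.
print(percent_identity("ACDE", "ACDF"), homology_tier("ACDE", "ACDE"))
```

Because the thresholds are stated as "in excess of", a sequence that
is exactly 80% identical does not satisfy the first tier in this
sketch.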

Variations are contemplated in the nucleic acid sequences shown in
SEQ ID NO:1 through SEQ ID NO:49 which will result in a nucleic acid
sequence that is capable of directing production of a protein having
at least six contiguous amino acids shown in SEQ ID NO:50 through SEQ
ID NO:98 or an analog thereof. Due to the degeneracy of the genetic
code, it is to be understood that numerous choices of nucleotides may
be made that will lead to a DNA sequence capable of directing
production of the instant protein or its analogs. As such, DNA
sequences which are functionally equivalent to the sequences set
forth above or which are functionally equivalent to sequences that
would direct production of HVR1 amino acid sequences set forth in SEQ
ID NOs:50-98 or analogs thereof are intended to be encompassed within
the present invention.

The term analog as used throughout the specification or claims to
describe the HVR1 proteins of the present invention, includes any
protein having an amino acid residue sequence substantially identical
to a sequence specifically shown herein in which one or more residues
have been conservatively substituted with a biologically equivalent
residue. Examples of conservative substitutions include the
substitution of one non-polar (hydrophobic) residue such as
isoleucine, valine, leucine or methionine for another, the
substitution of one polar (hydrophilic) residue for another such as
between arginine and lysine, between glutamine and asparagine,
between glycine and serine, the substitution of one basic residue
such as lysine, arginine or histidine for another, or the
substitution of one acidic residue, such as aspartic acid or glutamic
acid for another.
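
The example substitution groups listed above can be expressed as a
simple lookup, sketched here for illustration. The groups encode only
the examples named in this paragraph and are not an exhaustive
definition of conservative substitution; the function names are
hypothetical, and the sketch compares sequences of equal length only,
so additions, deletions and chemically derivatized residues are
outside its scope.

```python
# Single-letter codes for the example groups named in the text.
CONSERVATIVE_GROUPS = (
    frozenset("IVLM"),  # non-polar (hydrophobic): Ile, Val, Leu, Met
    frozenset("RK"),    # polar (hydrophilic): Arg, Lys
    frozenset("QN"),    # polar (hydrophilic): Gln, Asn
    frozenset("GS"),    # polar (hydrophilic): Gly, Ser
    frozenset("KRH"),   # basic: Lys, Arg, His
    frozenset("DE"),    # acidic: Asp, Glu
)


def is_conservative(a, b):
    """True if replacing residue a with b falls within one example group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differs_only_conservatively(candidate, native):
    """True if every mismatch between equal-length sequences is conservative."""
    if len(candidate) != len(native):
        raise ValueError("sequences must be of equal length")
    return all(
        c.upper() == n.upper() or is_conservative(n, c)
        for c, n in zip(candidate, native)
    )


# Toy peptides, not taken from the sequence listing.
print(is_conservative("I", "V"), differs_only_conservatively("ARE", "AKD"))
```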

The phrase "conservative substitution" also includes the use of a
chemically derivatized residue in place of a non-derivatized residue
provided that the resulting protein is biologically equivalent to the
native HVR1 protein.

"Chemical derivative" refers to an HVR1 protein having one or more
residues chemically derivatized by reaction of a functional side
group. Examples of such derivatized molecules include, but are not
limited to, those molecules in which free amino groups have been
derivatized to form amine hydrochlorides, p-toluene sulfonyl groups,
carbobenzoxy groups, t-butyloxycarbonyl groups, chloroacetyl groups
or formyl groups. Free carboxyl groups may be derivatized to form
salts, methyl and ethyl esters or other types of esters or
hydrazides. Free hydroxyl groups may be derivatized to form O-acyl or
O-alkyl derivatives. The imidazole nitrogen of histidine may be
derivatized to form N-im-benzylhistidine. Also included as chemical
derivatives are those proteins which contain one or more
naturally-occurring amino acid derivatives of the twenty standard
amino acids. For example: 4-hydroxyproline may be substituted for
proline; 5-hydroxylysine may be substituted for lysine;
3-methylhistidine may be substituted for histidine; homoserine may be
substituted for serine; and ornithine may be substituted for lysine.
The HVR1 proteins of the present invention also include any protein
having one or more additions and/or deletions of residues relative to
the sequence of a peptide whose sequence is shown herein, so long as
the protein is biologically equivalent to the native HVR1 protein.

The present invention also relates to multiple computer-generated
alignments of the nucleotide and deduced amino acid sequences shown
in SEQ ID NOs:1-98.

The grouping of SEQ ID NOs:1-49 into HCV genotypes is shown below.

SEQ ID NOs:    Subtypes    Genotypes

1-8            I/1a        1
9-25           II/1b       1
26-29          III/2a      2
30-32          IV/2b       2
33             2c          2
34-38          V/3a        3
39             4a          4
40             4b          4
41-42          4c          4
43             4d          4
44-48          5a          5
49             6a          6


For those subtypes or genotypes containing more than one HVR1
nucleotide sequence, computer alignment of the constituent nucleotide
sequences of the subtype or genotype was conducted using the program
GENALIGN (IntelliGenetics Inc., Mountain View, CA) in order to
produce a consensus sequence. These alignments and their resultant
consensus sequences are shown in Figures 1A-1J. Further alignment of
the sequences of all 49 HVR1 sequences to produce a consensus
sequence for all genotypes is shown in Figure 1K. The consensus
sequences shown in Figures 1A-K serve to highlight regions of
homology and non-homology between sequences found within the same
subtype or genotype or in different genotypes and hence, these
alignments can be used by one skilled in the art to select HVR1
sequences useful as reagents in diagnostic assays or vaccines.

The grouping of SEQ ID NOs:50-98 into HCV genotypes is shown below:

SEQ ID NOs:    Subtypes    Genotypes

50-57          I/1a        1
58-74          II/1b       1
75-78          III/2a      2
79-81          IV/2b       2
82             2c          2
83-87          V/3a        3
88             4a          4
89             4b          4
90-91          4c          4
92             4d          4
93-97          5a          5
98             6a          6


For those subtypes or genotypes containing more than one HVR1 amino
acid sequence, computer alignment of the constituent sequences of
each subtype or genotype was conducted using the computer program
GENALIGN in order to produce a consensus sequence. These alignments
and their resultant consensus sequences are shown in Figures 2A-J.
Alignment of all 49 HVR1 sequences to produce a consensus amino acid
sequence for all genotypes is shown in Figure 2K. The consensus
sequences shown in Figures 2A-2K serve to highlight regions of
homology and non-homology between HVR1 amino acid sequences of the
same subtype or genotype and of different genotypes and hence, these
alignments can readily be used by those skilled in the art to design
HVR1 proteins useful in assays and vaccines for the diagnosis and
prevention of HCV infection.
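
As an illustrative sketch, the groupings of SEQ ID NOs:1-49 and SEQ
ID NOs:50-98 into subtypes and genotypes set out above can be encoded
as a single lookup, relying on the fact that each amino acid sequence
has a SEQ ID NO 49 higher than its nucleotide sequence. The names
`SUBTYPE_RANGES` and `classify` are hypothetical.

```python
# (first SEQ ID NO, last SEQ ID NO, subtype) for the nucleotide sequences.
SUBTYPE_RANGES = (
    (1, 8, "1a"), (9, 25, "1b"),
    (26, 29, "2a"), (30, 32, "2b"), (33, 33, "2c"),
    (34, 38, "3a"),
    (39, 39, "4a"), (40, 40, "4b"), (41, 42, "4c"), (43, 43, "4d"),
    (44, 48, "5a"),
    (49, 49, "6a"),
)


def classify(seq_id):
    """Return (subtype, genotype) for any SEQ ID NO from 1 to 98."""
    if not 1 <= seq_id <= 98:
        raise ValueError("SEQ ID NO must be between 1 and 98")
    # Amino acid sequences 50-98 correspond to nucleotide sequences 1-49.
    nucleotide_id = seq_id - 49 if seq_id >= 50 else seq_id
    for first, last, subtype in SUBTYPE_RANGES:
        if first <= nucleotide_id <= last:
            return subtype, int(subtype[0])
    raise AssertionError("ranges cover 1-49, so this is unreachable")


print(classify(58), classify(98))
```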

In order to identify hydrophilic domains within HVR1 that might
represent antigenic determinants, a Kyte and Doolittle analysis
(Kyte, J. and Doolittle, R.F. (1982) J. Mol. Biol., 157:105-132) of
each of the amino acid sequences shown in SEQ ID NOs:50-98 was
conducted. The observed hydrophilic domains for the amino acid
sequences of each of these isolates are shown below where amino acid
position 1 is the amino-terminal amino acid of the HVR1 amino acid
sequences shown in SEQ ID NOs:50-98.

(Note that all the amino acid sequences shown in SEQ ID NOs:50-98 are
32 amino acids in length except for SEQ ID NOs:58 and 59 (isolates D1
and D3 respectively) which are 36 amino acids in length due to the
presence of an additional four amino acids in their amino termini and
SEQ ID NO:98 which is lacking a single amino terminal amino acid
relative to SEQ ID NOs:50-57 and 60-97 and five amino terminal amino
acids relative to SEQ ID NOs:58 and 59. Thus in the table below, the
first four amino acids of SEQ ID NOs:58 and 59 are represented by the
numbers -4, -3, -2 and -1 while the first amino acid in SEQ ID NO:98
(isolate HK2) is assigned the number 2.)

Type    Isolate    Amino acid position of HVR1 5'→3'

6a      HK2        2-6         9-13        23-28
5a      SA6        1-5         9-14        22-28
5a      SA13       1-5         9-13        22-28
5a      SA1        1-4         11-15       22-28
5a      SA7        1-2         11-14       23-28
5a      SA4        1-5         9-13        23-28
4c      Z6         1-4         9-15        22-28
4b      Z1         1-4         9-14        23-28
4a      Z4         1-4         7-13        22-28
3a      S2         1-5         9-14        23-28
3a      S52        1-5         12-15       23-28
2c      S83        1-5         9-15        22-28
2b      T8         1-6         9-13        22-28
1b      T3         1-4         11-14       23-28
1b      HK4        1-4         9-16        23-28
1b      HK3        1-4         10-16       23-28
1b      S9         1-2         8-14        23-28
1b      IND8       1-2         7-16        23-28
1b      T10        1-5         9-14        23-28
1b      DK1        1-3         8-14        23-28
1b      P10        1-6         12-16       23-28
1a      S18        1-5         8-16        23-28
1a      SW1        1-5         9-13        23-28
1a      S14        1-3         8-13        23-28
1a      US11       1-4         8-10        23-28
3a      S54        1-6         9-16        23-28
1b      IND5       1-14                    22-28
1a      DR1        1-12                    22-28
1b      D3         -4 to 1     9-13        23-28
1b      HK8        1-4         9-15        23-28
1a      DK9        1-5         9-14        23-28
1b      SA10       1-13                    23-28
1b      S45        1-13                    23-27
1b      D1         -4 to 14                23-28
1b      SW2        1-15                    23-28
2a      T2         1-14                    23-28
2a      T9         1-13                    23-28
2b      DK8        1-14                    23-28
1a      DK7        1-5         8-9         23-28
1a      DR4        1-5         9-12        22-28
1b      US6        1-4         8-16        22-28
1b      HK5        1-2         9-16        23-28
2a      T4         1-2         12-15       23-28
2a      US10       1-6         9-10        23-28
3a      HK10                   9-13        23-28
4d      DK13                   7-13        22-28
4c      Z7                     12-13       23-28
3a      DK12       1-14                    23-28
2b      DK11       1-4         12-13       22-28

The data presented above illustrate that there are typically three
hydrophilic domains present in the HVR1 amino acid sequences shown in
SEQ ID NOs:50-98. These hydrophilic domains are located at the amino
and carboxy termini of HVR1 and in roughly the middle of HVR1.
Although all three of these hydrophilic domains may represent
important antigenic determinants, the carboxy terminal hydrophilic
domain of about 6 amino acids in length is of particular interest in
that it is universally conserved in the amino acid sequences shown in
SEQ ID NOs:50-98. This conservation of the C-terminal hydrophilic
domain suggests that this domain may not only be an immunodominant
epitope for HCV but may also play an important role in the viral life
cycle. Thus, amino acid sequences containing the C-terminal
hydrophilic domains of SEQ ID NOs:50-98 are preferred immunogens in
the vaccines of the present invention.
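
By way of example only, a Kyte and Doolittle style hydropathy scan of
the kind referred to above can be sketched as a sliding-window
average over the published hydropathy index of each residue. The
window size of 7, the threshold of 0 and the function names are
assumptions made for this sketch; the specification does not state
the parameters used to obtain the tabulated domains.

```python
# Kyte and Doolittle (1982) hydropathy index; negative values are hydrophilic.
KD_INDEX = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of each window; one value per window start."""
    scores = [KD_INDEX[aa] for aa in sequence.upper()]
    if window < 1 or window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_window_starts(sequence, window=7, threshold=0.0):
    """1-based start positions of windows whose mean is below threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i + 1 for i, score in enumerate(profile) if score < threshold]


# Toy peptide, not taken from the sequence listing.
print(hydrophilic_window_starts("IIIIIIIKKKKKKK"))
```

Positions returned by this sketch are window start positions; runs of
consecutive hydrophilic windows would still need to be merged to give
domain boundaries comparable to those tabulated above.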

Accordingly, the present invention includes a recombinant DNA method
for the manufacture of HVR1 proteins in which natural or synthetic
nucleic acid sequences may be used to direct the production of HVR1
proteins having at least six contiguous amino acids contained in the
amino acid sequences shown in SEQ ID NOs:50-98.

In one embodiment of the invention, the method comprises:

(a) preparation of a nucleic acid sequence capable of directing a
host organism to produce HVR1 protein;

(b) cloning the nucleic acid sequence into a 
vector capable of being transferred into and replicated in 
a host organism, such vector containing operational 
elements for the nucleic acid sequence; 

(c) transferring the vector containing the 
nucleic acid and operational elements into a host organism 
capable of expressing the protein; 

(d) culturing the host organism under 
conditions appropriate for amplification of the vector and 
expression of the protein; and 

(e) harvesting the protein. 

In another embodiment of the invention, the method for the
recombinant DNA synthesis of an HCV HVR1 protein encoded by any one
of the nucleic acid sequences shown in SEQ ID NOs:1-49 comprises:

(a) culturing a transformed or transfected host organism containing a
nucleic acid sequence capable of directing the host organism to
produce HVR1 protein, under conditions such that the protein is
produced, said protein exhibiting substantial homology to a native
HVR1 protein having an amino acid sequence according to any one of
the amino acid sequences shown in SEQ ID NOs:50-98.

In one embodiment, the RNA sequence of an HCV isolate was isolated
and converted to cDNA as follows. Viral RNA was extracted from a
biological sample collected from human subjects infected with
hepatitis C and the viral RNA was then reverse transcribed and
amplified by polymerase chain reaction using primers deduced from the
sequence of HCV strain H-77 as described in Bukh et al. ((1993) Proc.
Natl. Acad. Sci. USA, 90:8234-8238). Once amplified, the PCR
fragments were isolated by gel electrophoresis and sequenced. This
approach was used to obtain the nucleic acid sequences shown in SEQ
ID NOs:1-49. In an alternative embodiment, a nucleic acid sequence
capable of directing host organism synthesis of the given HVR1
protein may be synthesized chemically and inserted into an expression
vector.

The vectors contemplated for use in the present 
invention include any vectors into which a nucleic acid 
sequence as described above can be inserted, along with 
any preferred or required operational elements, and which 
vector can then be subsequently transferred into a host 
organism and replicated in such organisms. Preferred 
vectors are those whose restriction sites have been well 
documented and which contain the operational elements 
preferred or required for transcription of the nucleic 
acid sequence. 

The "operational elements" as discussed herein 
include at least one promoter, at least one operator, at 
least one leader sequence, at least one terminator codon, 
and any other DNA sequences necessary or preferred for 
appropriate transcription and subsequent translation of 
the vector nucleic acid. In particular, it is 
contemplated that such vectors will contain at least one 
origin of replication recognized by the host organism 
along with at least one selectable marker and at least one 
promoter sequence capable of initiating transcription of 
the nucleic acid sequence. 

In construction of the recombinant expression vectors of the present
invention, it should additionally be noted that multiple copies of
the nucleic acid sequence of interest and its attendant operational
elements may be inserted into each vector. In such an embodiment, the
host organism would produce greater amounts per vector of the desired
HVR1 protein. The number of multiple copies of the nucleic acid
sequence which may be inserted into the vector is limited only by the
ability of the resultant vector, due to its size, to be transferred
into and replicated and transcribed in an appropriate host
microorganism.

Of course, those of ordinary skill in the art would readily
understand that multiple copies of different HVR1 nucleic acid
sequences may be inserted into a single vector such that a host
organism transformed or transfected with said vector would produce
multiple HVR1 proteins. For example, a polycistronic vector in which
multiple different HVR1 proteins may be expressed from a single
vector is created by placing expression of each protein under control
of an internal ribosomal entry site (IRES) (Molla, A. et al. Nature,
356:255-257 (1992); Jang, S.K. et al. J. of Virol., 63:1651-1660
(1989)).

In another embodiment, restriction digest 
fragments containing a sequence coding for HVR1 proteins
can be inserted into a suitable expression vector that
functions in prokaryotic or eukaryotic cells. By suitable
is meant that the vector is capable of carrying and
expressing a complete nucleic acid sequence coding for an
HVR1 protein. Preferred expression vectors are those that
function in a eukaryotic cell. Examples of such vectors 
include, but are not limited to, plasmid, vaccinia virus, 
adenovirus, retrovirus or herpes virus vectors. 

In yet another embodiment, the selected 
recombinant expression vector may then be transfected into 
a suitable eukaryotic cell system for purposes of 
expressing the recombinant protein. Such eukaryotic cell 
systems include but are not limited to cell lines such as 
HeLa, MRC-5 or CV-1 or other monkey kidney cell
substrates.

The expressed recombinant protein may be
detected by methods known in the art including, but not
limited to, Coomassie blue staining and Western blotting.

The present invention also relates to
substantially purified and isolated recombinant HVR1
proteins. In one embodiment, the expressed recombinant
protein can be obtained as a crude lysate or it can be
purified by standard protein purification procedures known
in the art which may include differential precipitation,
molecular sieve chromatography, ion-exchange
chromatography, isoelectric focusing, gel electrophoresis
and affinity and immunoaffinity chromatography. The
recombinant protein may be purified by passage through a
column containing a resin which has bound thereto
antibodies specific for HVR1 protein.

Alternatively, those of ordinary skill in the 
art would be aware that the proteins of the present 
invention or analogs thereof can be synthesized by 
automated instruments sold by a variety of manufacturers 
or can be commercially custom-ordered and prepared. The
term analog has been described earlier in the
specification and for purposes of describing the proteins
of the present invention, analogs can further include
branched, cyclic or other non-linear arrangements of the
amino acid sequences of the present invention.

The present invention therefore relates to the
use of recombinant or synthetic HVR1 proteins as
diagnostic agents and vaccines. In one embodiment, the
proteins of this invention can be used in immunoassays for
diagnosing or prognosing hepatitis C in a mammal. For the
purposes of the present invention, "mammal" as used
throughout the specification and claims, includes, but is
not limited to humans, chimpanzees, other primates and the
like. In a preferred embodiment, the immunoassay is
useful in diagnosing hepatitis C infection in humans.

Immunoassays of the present invention may be
those commonly used by those skilled in the art including,
but not limited to, radioimmunoassay, Western blot assay,
immunofluorescent assay, enzyme immunoassay,
chemiluminescent assay, immunohistochemical assay,
immunoprecipitation and the like. Standard techniques
known in the art for ELISA are described in Methods in
Immunodiagnosis, 2nd Edition, Rose and Bigazzi, eds., John
Wiley and Sons, 1980 and Campbell et al., Methods of
Immunology, W.A. Benjamin, Inc., 1964, both of which are
incorporated herein by reference. Such assays may be a
direct, indirect, competitive, or noncompetitive
immunoassay as described in the art (Oellerich, M. 1984.
J. Clin. Chem. Clin. Biochem. 22:895-904). Biological
samples appropriate for such detection assays include, but
are not limited to, serum, liver, saliva, lymphocytes or
other mononuclear cells.

In a preferred embodiment, test serum is reacted 
with a solid phase reagent having surface-bound
recombinant HVR1 protein(s) as antigen(s). The solid
surface reagent can be prepared by known techniques for 
attaching protein to solid support material. These 
attachment methods include non-specific adsorption of the 
protein to the support or covalent attachment of the 
protein to a reactive group on the support. After 
reaction of the antigen with anti-HCV antibody, unbound 
serum components are removed by washing and the antigen- 
antibody complex is reacted with a secondary antibody such 
as labelled anti-human antibody. The label may be an
enzyme which is detected by incubating the solid support
in the presence of a suitable fluorimetric or colorimetric
reagent. Other detectable labels may also be used, such 
as radiolabels or colloidal gold, and the like. 

The HCV HVR1 proteins and analogs thereof may be
prepared in the form of a kit, alone, or in combinations
with other reagents such as secondary antibodies, for use
in immunoassays. It is understood by those of ordinary
skill in the art that due to the variability of HVR1
amino acid sequences between genotypes, the use of a
single HVR1 protein as an antigen in the above-described
immunoassays may be useful in detecting a single genotype
of HCV. Alternatively, the use of HVR1 proteins of
multiple genotypes as antigens in the above-described
immunoassays can serve as universal probes capable of
detecting all genotypes of HCV.

In yet another embodiment, the HVR1 proteins or
analogs thereof can be used as a vaccine to protect 
mammals against challenge with hepatitis C. The vaccine, 
which acts as an immunogen, may be a cell, cell lysate 
from cells transfected with a recombinant expression 
vector or a culture supernatant containing the expressed 
protein. Alternatively, the immunogen is a partially or 
substantially purified recombinant protein or a chemically 
synthesized protein. In a preferred embodiment, HVR1
proteins having amino acid sequences found in multiple HCV 
isolates from different genotypes are administered 
together to provide protection against challenge with 
multiple isolates of HCV or a synthetic protein. 

While it is possible for the immunogen to be 
administered in a pure or substantially pure form, it is 
preferable to present it as a pharmaceutical composition, 
formulation or preparation. 

The formulations of the present invention, both 
for veterinary and for human use, comprise an immunogen as 
described above, together with one or more
pharmaceutically acceptable carriers and optionally other
therapeutic ingredients. The carrier(s) must be
"acceptable" in the sense of being compatible with the
other ingredients of the formulation and not deleterious
to the recipient thereof. The formulations may
conveniently be presented in unit dosage form and may be
prepared by any method well-known in the pharmaceutical
art.

All methods include the step of bringing into 
association the active ingredient with the carrier which 
constitutes one or more accessory ingredients. In 
general, the formulations are prepared by uniformly and
intimately bringing into association the active ingredient
with liquid carriers or finely divided solid carriers or
both, and then, if necessary, shaping the product into the
desired formulation.

Formulations suitable for intravenous,
intramuscular, subcutaneous, or intraperitoneal 
administration conveniently comprise sterile aqueous 
solutions of the active ingredient with solutions which 
are preferably isotonic with the blood of the recipient. 
Such formulations may be conveniently prepared by 
dissolving the solid active ingredient in water 
containing physiologically compatible substances such as 
sodium chloride (e.g. 0.1-2.0 M), glycine, and the like,
and having a buffered pH compatible with physiological 
conditions to produce an aqueous solution, and rendering 
said solution sterile. These may be present in unit or 
multi-dose containers, for example, sealed ampules or 
vials.

The formulations of the present invention may 
incorporate a stabilizer. Illustrative stabilizers are 
preferably incorporated in an amount of 0.10-10,000 parts 
by weight per part by weight of immunogens. If two or
more stabilizers are to be used, their total amount is 
preferably within the range specified above. These 
stabilizers are used in aqueous solutions at the 
appropriate concentration and pH. The specific osmotic 
pressure of such aqueous solutions is generally in the 
range of 0.1-3.0 osmoles, preferably in the range of 0.8- 
1.2. The pH of the aqueous solution is adjusted to be 
within the range of 5.0-9.0, preferably within the range
of 6-8. In formulating the immunogen of the present
invention, an anti-adsorption agent may be used.

Additional pharmaceutical methods may be 
employed to control the duration of action. Controlled 
release preparations may be achieved through the use of 
polymer to complex or adsorb the proteins or their 
derivatives. The controlled delivery may be exercised by 
selecting appropriate macromolecules (for example
polyester, polyamino acids, polyvinyl pyrrolidone,
ethylenevinylacetate, methyl cellulose,
carboxymethylcellulose, or protamine sulfate) and the
concentration of macromolecules as well as the methods of 
incorporation in order to control release. Another 
possible method to control the duration of action by 
controlled-release preparations is to incorporate the
proteins, protein analogs or their functional derivatives, 
into particles of a polymeric material such as polyesters, 
polyamino acids, hydrogels, poly(lactic acid) or ethylene
vinylacetate copolymers. Alternatively, instead of 
incorporating these agents into polymeric particles, it is 
possible to entrap these materials in microcapsules 
prepared, for example, by coacervation techniques or by 
interfacial polymerization, for example,
hydroxymethylcellulose or gelatin microcapsules and
poly(methylmethacrylate) microcapsules, respectively, or in
colloidal drug delivery systems, for example, liposomes,
albumin microspheres, microemulsions, nanoparticles, and
nanocapsules or in macroemulsions.

When oral preparations are desired, the 
compositions may be combined with typical carriers, such 
as lactose, sucrose, starch, talc, magnesium stearate, 
crystalline cellulose, methyl cellulose, carboxymethyl 
cellulose, glycerin, sodium alginate or gum arabic among 
others.

Vaccination can be conducted by conventional 
methods. For example, the immunogen or immunogens can be 
used in a suitable diluent such as saline or water, or 
complete or incomplete adjuvants. Further, the 
immunogen(s) may or may not be bound to a carrier to make
the protein(s) immunogenic. Examples of such carrier
molecules include but are not limited to bovine serum
albumin (BSA), keyhole limpet hemocyanin (KLH), tetanus
toxoid, and the like. The immunogen(s) can be
administered by any route appropriate for antibody
production such as intravenous, intraperitoneal,
intramuscular, subcutaneous, and the like. The
immunogen(s) may be administered once or at periodic
intervals until a significant titer of anti-HCV antibody
is produced. The antibody may be detected in the serum
using an immunoassay. Doses of HVR1 protein(s) effective
to elicit a protective antibody response against HCV
infection range from about 0.1 to about 100 µg with a more
preferred range being about 2 to about 20 µg.

In yet another embodiment, the immunogen may be 
a nucleic acid sequence or sequences capable of directing
host organism synthesis of HVR1 protein(s). Such nucleic
acid sequence(s) may be inserted into a suitable
expression vector by methods known to those skilled in the 
art. Expression vectors suitable for producing high 
efficiency gene transfer in vivo include retroviral, 
adenoviral and vaccinia viral vectors. Operational 
elements of such expression vectors are disclosed 
previously in the present specification and are known to 
one skilled in the art. Such expression vectors can be 
administered intravenously, intramuscularly,
intradermally, subcutaneously, intraperitoneally or
orally.

In an alternative embodiment, direct gene 
transfer may be accomplished via intramuscular injection 
of, for example, plasmid-based eukaryotic expression 
vectors containing a nucleic acid sequence capable of 
directing host organism synthesis of HVR1 protein(s).
Such an approach has previously been utilized to produce
the hepatitis B surface antigen in vivo and resulted in an
antibody response to the surface antigen (Davis, H.L. et
al. (1993) Human Molecular Genetics, 2:1847-1851; see also
Davis et al. (1993) Human Gene Therapy, 4:151-159 and 733-
740). In a preferred embodiment, HVR1 nucleic acid
sequences of isolates from multiple genotypes of HCV are
administered together to provide protection against
challenge with multiple genotypes of HCV.

Doses of HVR1 protein(s)-encoding nucleic acid
sequence effective to elicit a protective antibody
response against HCV infection range from about 0.5 to
about 5000 µg, with a more preferred range being about 10 to
about 1000 µg.

The HVR1 proteins and expression vectors
containing a nucleic acid sequence capable of directing
host organism synthesis of HVR1 protein(s) may be supplied
in the form of a kit, alone, or in the form of a
pharmaceutical composition as described above.

The nucleic acid sequences of the present 
invention or primers/probes derived therefrom can also be 
used to analyze the RNA of a mammal for the presence of 
specific hepatitis C virus isolates. 

The RNA to be analyzed can be isolated from 
serum, liver, saliva, lymphocytes or other mononuclear 
cells as viral RNA, whole cell RNA or as poly(A) RNA.
Whole cell RNA can be isolated by methods known to those
skilled in the art. Such methods include extraction of
RNA by differential precipitation (Birnboim, H.C. (1988)
Nucleic Acids Res., 16:1487-1497), extraction of RNA by
organic solvents (Chomczynski, P. et al. (1987) Anal.
Biochem., 162:156-159) and extraction of RNA with strong
denaturants (Chirgwin, J.M. et al. (1979) Biochemistry,
18:5294-5299). Poly(A)+ RNA can be selected from whole
cell RNA by affinity chromatography on oligo-d(T) columns
(Aviv, H. et al. (1972) Proc. Natl. Acad. Sci., 69:1408-
1412) or Poly(U) RNA can be selected by affinity
chromatography on oligo-d(A) columns. A preferred method
of isolating RNA is extraction of viral RNA by the
guanidinium-phenol-chloroform method of Bukh et al.
(1992a).

The methods for analyzing the RNA for the
presence of HCV include, but are not limited to, Northern
blotting (Alwine, J.C. et al. (1977) Proc. Natl. Acad.
Sci., 74:5350-5354), dot and slot blot hybridization
(Kafatos, F.C. et al. (1979) Nucleic Acids Res., 7:1541-
1552), filter hybridization (Hollander, M.C. et al. (1990)
Biotechniques, 9:174-179), RNase protection (Sambrook, J.
et al. (1989) in "Molecular Cloning, A Laboratory Manual",
Cold Spring Harbor Press, Plainview, NY) and reverse-
transcription polymerase chain reaction (RT-PCR) (Watson,
J.D. et al. (1992) in "Recombinant DNA" Second Edition,
W.H. Freeman and Company, New York).

A preferred method for analyzing the RNA is RT- 
PCR. In this method, the RNA can be reverse transcribed 
to first strand cDNA using a primer or primers derived 
from the nucleotide sequences shown in SEQ ID NOs:1-49 or
sequences complementary to those. Once the cDNAs are 
synthesized, PCR amplification is carried out using pairs 
of primers designed to hybridize with sequences in the 
hypervariable region which are an appropriate distance 
apart (at least about 50 nucleotides) to permit 
amplification of the cDNA and subsequent detection of the 
amplification product. Each primer of a pair is a single- 
stranded oligonucleotide of about 15 to about 40 bases in 
length with a more preferred range being about 20 to about
30 bases in length where one primer (the "upstream" 
primer) is complementary to the original RNA and the 
second primer (the "downstream" primer) is complementary 
to the first strand of cDNA generated by reverse 
transcription of the RNA. Optimization of the 
amplification reaction to obtain sufficiently specific 
hybridization to the nucleotide sequence of interest is 
well within the skill in the art and is preferably 
achieved by adjusting the annealing temperature. 
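
The length and spacing constraints stated above can be checked mechanically before primers are ordered. The following is a minimal sketch only, not part of the disclosed method: the Primer class, the function names, and the choice to measure "distance apart" as the offset between the start positions of the two binding sites are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Primer:
    """Hypothetical record for one primer of a pair."""
    name: str
    start: int   # 0-based start of the binding site on the target (assumed convention)
    length: int  # primer length in bases


def length_class(primer: Primer) -> str:
    """Classify primer length against the ranges given in the text."""
    if 20 <= primer.length <= 30:
        return "preferred"       # about 20 to about 30 bases
    if 15 <= primer.length <= 40:
        return "acceptable"      # about 15 to about 40 bases
    return "out of range"


def check_pair(upstream: Primer, downstream: Primer,
               min_separation: int = 50) -> List[str]:
    """Return a list of problems; an empty list means the pair passes."""
    problems = []
    for p in (upstream, downstream):
        if length_class(p) == "out of range":
            problems.append(f"{p.name}: length {p.length} outside 15-40 bases")
    # Assumption: separation is the offset between binding-site starts.
    separation = downstream.start - upstream.start
    if separation < min_separation:
        problems.append(
            f"sites are {separation} nt apart; at least about "
            f"{min_separation} nt is needed"
        )
    return problems


if __name__ == "__main__":
    ok = check_pair(Primer("up", 0, 20), Primer("down", 60, 22))
    print(ok or "pair passes")
```

Annealing temperature, which the text names as the preferred optimization variable, is deliberately left out of this sketch because it depends on reaction conditions not specified here.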

The amplification products of PCR can be 
detected either directly or indirectly. In one 
embodiment, direct detection of the amplification products 
is carried out via labelling of primer pairs. Labels 
suitable for labelling the primers of the present 
invention are known to one skilled in the art and include 
radioactive labels, biotin, avidin, enzymes and 
fluorescent molecules. The desired labels can be
incorporated into the primers prior to performing the
amplification reaction. A preferred labelling procedure
utilizes radiolabeled ATP and T4 polynucleotide kinase
(Sambrook, J. et al. (1989) in "Molecular Cloning, A
Laboratory Manual", Cold Spring Harbor Press, Plainview,
NY). Alternatively, the desired label can be incorporated
into the primer extension products during the
amplification reaction in the form of one or more labelled
dNTPs. In the present invention, the labelled amplified
PCR products can be detected by agarose gel
electrophoresis followed by ethidium bromide staining and
visualization under ultraviolet light or via direct
sequencing of the PCR products.

In yet another embodiment, unlabelled
amplification products can be detected via hybridization
with nucleic acid probes, radioactively labelled
or labelled with biotin, in methods known to one skilled
in the art such as dot and slot blot hybridization
(Kafatos, F.C. et al. (1979)) or filter hybridization
(Hollander, M.C. et al. (1990)).

In one embodiment, the nucleic acid sequences 
used as probes are selected from, and substantially 
homologous to, SEQ ID NOs:1-49. In an alternative
embodiment, the sequence alignments shown in Figures 1A-1K
may be used to design hybridization probes. 

The nucleic acid sequence used as a probe to 
detect PCR amplification products of the present invention 
can be labeled in single-stranded or double-stranded form.
Labelling of the nucleic acid sequence can be carried out
by techniques known to one skilled in the art. Such
labelling techniques can include radiolabels and enzymes
(Sambrook, J. et al. (1989) in "Molecular Cloning, A
Laboratory Manual", Cold Spring Harbor Press, Plainview,
New York). In addition, there are known non-radioactive
techniques for signal amplification including methods for
attaching chemical moieties to pyrimidine and purine rings
(Dale, R.N.K. et al. (1973) Proc. Natl. Acad. Sci.,
70:2238-2242; Heck, R.F. (1968) J. Am. Chem. Soc.,
90:5518-5523), methods which allow detection by
chemiluminescence (Barton, S.K. et al. (1992) J. Am. Chem.
Soc., 114:8736-8740) and methods utilizing biotinylated
nucleic acid probes (Johnson, T.K. et al. (1983) Anal.
Biochem., 133:126-131; Erickson, P.F. et al. (1982) J. of
Immunology Methods, 51:241-249; Matthaei, F.S. et al.
(1986) Anal. Biochem., 157:123-128) and methods which
allow detection by fluorescence using commercially
available products.

The administration of the nucleic acid sequences 
or proteins of the present invention as immunogens may be 
for either a prophylactic or therapeutic purpose. When 
provided prophylactically, the immunogen(s) is provided in
advance of any exposure to HCV or in advance of any
symptom(s) due to HCV infection. The prophylactic
administration of the immunogen serves to prevent or
attenuate any subsequent infection of HCV in a mammal.
When provided therapeutically, the immunogen(s) is
provided at (or shortly after) the onset of the infection
or at the onset of any symptom of infection or disease
caused by HCV or at any time thereafter. The therapeutic
administration of the immunogen(s) serves to attenuate or
eradicate the infection or disease. 

In addition to use as a vaccine, the 
compositions can be used to prepare antibodies to the HVR1
protein. The antibodies can be used directly as antiviral
agents or they may be used in immunoassays disclosed
herein to detect the presence of the hepatitis C virus in
patient sera. To prepare antibodies, a host animal can
be immunized using the HVR1 proteins of the present
invention or expression vectors containing nucleic acid
sequences encoding such proteins. The host serum or
plasma is collected following an appropriate time interval
to provide a composition comprising antibodies reactive
with the HVR1 region protein of the virus particle. The
gamma globulin fraction or the IgG antibodies can be
obtained, for example, by use of saturated ammonium
sulfate or DEAE Sephadex, or other techniques known to
those skilled in the art. The antibodies are
substantially free of many of the adverse side effects
which may be associated with other antiviral agents such
as drugs.

The antibody compositions can be made even more
compatible with the host system by minimizing potential
adverse immune system responses. This is accomplished by
removing all or a portion of the Fc portion of a foreign
species antibody or using an antibody of the same species
as the host animal, for example, the use of antibodies
from human/human hybridomas. Humanized antibodies (i.e.,
nonimmunogenic in a human) may be produced, for example,
by replacing an immunogenic portion of an antibody with a
corresponding, but nonimmunogenic portion (i.e., chimeric
antibodies). Such chimeric antibodies may contain the
reactive or antigen-binding portion of an antibody from
one species and the Fc portion of an antibody
(nonimmunogenic) from a different species. Examples of
chimeric antibodies include, but are not limited to, non-
human mammal-human chimeras, rodent-human chimeras,
murine-human and rat-human chimeras (Robinson et al.,
International Patent Application 184,187; Taniguchi M.,
European Patent Application 171,496; Morrison et al.,
European Patent Application 173,494; Neuberger et al., PCT
Application WO 86/01533; Cabilly et al., 1987 Proc. Natl.
Acad. Sci. USA 84:3439; Nishimura et al., 1987 Canc. Res.
47:999; Wood et al., 1985 Nature 314:446; Shaw et al.,
1988 J. Natl. Cancer Inst. 80:15553, all incorporated
herein by reference).

General reviews of "humanized" chimeric
antibodies are provided by Morrison S., 1985 Science
229:1202 and by Oi et al., 1986 BioTechniques 4:214.

Suitable "humanized" antibodies can be
alternatively produced by CDR or CEA substitution (Jones
et al., 1986 Nature 321:552; Verhoeyen et al., 1988
Science 239:1534; Biedler et al., 1988 J. Immunol. 141:4053,
all incorporated herein by reference).

The antibodies or antigen binding fragments may 
also be produced by genetic engineering. The technology 
for expression of both heavy and light chain genes in E.
coli is the subject of the PCT patent applications,
publication numbers WO 901443, WO901443, and WO 9014424, and
in Huse et al., 1989 Science 246:1275-1281.

The antibodies can also be used as a means of 
enhancing the immune response. The antibodies can be 
administered in amounts similar to those used for other 
therapeutic administrations of antibody. For example, 
normal immune globulin is administered at 0.02-0.1 ml/lb 
body weight during the early incubation period of other 
viral diseases such as rabies, measles, and hepatitis B to 
interfere with viral entry into cells. Thus, antibodies 
reactive with the HVR1 proteins can be passively
administered alone or in conjunction with another
antiviral agent to a host infected with an HCV to enhance the
immune response and/or the effectiveness of an antiviral
drug.

Alternatively, antibodies to the HVR1 region can
be induced by administering anti-idiotype antibodies as
immunogens. Conveniently, a purified antibody preparation
prepared as described above is used to induce anti-
idiotype antibody in a host animal. The composition is
administered to the host animal in a suitable diluent.
Following administration, usually repeated administration,
the host produces anti-idiotype antibody. To eliminate an
immunogenic response to the Fc region, antibodies produced
by the same species as the host animal can be used or the
Fc region of the administered antibodies can be removed.
Following induction of anti-idiotype antibody in the host
animal, serum or plasma is removed to provide an antibody
composition. The composition can be purified as described
above for anti-HVR1 antibodies, or by affinity
chromatography using anti-HVR1 antibodies bound to the
affinity matrix. The anti-idiotype antibodies produced are
similar in conformation to the authentic HVR1 amino acid
sequence and may be used to prepare an HCV vaccine rather than
using an HVR1 protein.

When used as a means of inducing anti-HCV virus
antibodies in an animal, the manner of injecting the
antibody is the same as for vaccination purposes, namely
intramuscularly, intraperitoneally, subcutaneously or the
like in an effective concentration in a physiologically 
suitable diluent with or without adjuvant. One or more 
booster injections may be desirable. 

The HVR1 proteins of the invention are also
intended for use in producing antiserum designed for pre-
or post-exposure prophylaxis. Here an HVR1 protein, or
mixture of HVR1 proteins, is formulated with a suitable
adjuvant and administered by injection to human
volunteers, according to known methods for producing human
antisera. Antibody response to the injected proteins is
monitored, during a several-week period following
immunization, by periodic serum sampling to detect the
presence of anti-HVR1 serum antibodies, using an
immunoassay as described herein. 

The antiserum from immunized individuals may be 
administered as a pre-exposure prophylactic measure for
individuals who are at risk of contracting infection. The
antiserum is also useful in treating an individual
post-exposure, analogous to the use of high titer antiserum
against hepatitis B virus for post-exposure prophylaxis. 

For both in vivo use of antibodies to HVR1
proteins and anti-idiotype antibodies and diagnostic use,
it may be preferable to use monoclonal antibodies.
Monoclonal anti-HVR1 protein antibodies or anti-idiotype
antibodies can be produced as follows. The spleen or
lymphocytes from an immunized animal are removed and
immortalized or used to prepare hybridomas by methods
known to those skilled in the art (Goding, J.W. 1983.
Monoclonal Antibodies: Principles and Practice, Academic
Press, Inc., NY, NY, pp. 56-97). To produce a human-human
hybridoma, a human lymphocyte donor is selected. A donor
known to be infected with HCV (where infection has been
shown for example by the presence of anti-virus antibodies
in the blood or by virus culture) may serve as a suitable
lymphocyte donor. Lymphocytes can be isolated from a
peripheral blood sample or spleen cells may be used if the
donor is subject to splenectomy. Epstein-Barr virus (EBV)
can be used to immortalize human lymphocytes or a human
fusion partner can be used to produce human-human
hybridomas. Primary in vitro immunization with peptides
can also be used in the generation of human monoclonal
antibodies.

Antibodies secreted by the immortalized cells 
are screened to determine the clones that secrete 
antibodies of the desired specificity. For monoclonal 
antibodies to the HVR1 amino acid sequences disclosed
herein, the antibodies must bind to HVR1 proteins. For
monoclonal anti-idiotype antibodies, the antibodies must
bind to anti-HVR1 protein antibodies. Cells producing
antibodies of the desired specificity are selected. 

The present invention also relates to the use of
single-stranded antisense poly- or oligonucleotides
derived from nucleotide sequences substantially homologous
to those shown in SEQ ID NOs:1-49 to inhibit the
expression of hepatitis C E2 genes. By substantially
homologous, as used throughout the specification and claims
to describe the nucleic acid sequences of the present
invention, is meant a level of homology between the
nucleic acid sequence and the SEQ ID NOs. referred to in
the above sentence. Preferably, the level of homology is
in excess of 80%, more preferably in excess of 90%, with a
preferred nucleic acid sequence being in excess of 95%
homologous with the DNA sequence shown in the indicated
SEQ ID NO. These antisense poly- or oligonucleotides can
be either DNA or RNA. The targeted sequence is typically
messenger RNA and more preferably, a single sequence
required for processing or translation of the RNA. The
antisense poly- or oligonucleotides can be conjugated to
a polycation such as polylysine as disclosed in Lemaitre,
M. et al. ((1989) Proc. Natl. Acad. Sci. USA, 84:648-652)
and this conjugate can be administered to a mammal in an
amount sufficient to hybridize to and inhibit the function
of the messenger RNA.
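
The homology thresholds above (in excess of 80%, 90% and 95%) can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the specification's own procedure: it treats "homology" as percent identity over two sequences that have already been aligned to equal length, and the function names and example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions carrying the same base.

    Assumes the two sequences are already aligned to equal length;
    a gap character '-' never counts as a match.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be pre-aligned, equal length, non-empty")
    matches = sum(
        1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(a)


def homology_tier(identity: float) -> str:
    """Map a percent identity onto the tiers named in the text."""
    if identity > 95:
        return "in excess of 95%"
    if identity > 90:
        return "in excess of 90%"
    if identity > 80:
        return "in excess of 80%"
    return "not substantially homologous"


if __name__ == "__main__":
    # Hypothetical 10-nt sequences differing at one position.
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, homology_tier(identity))
```

Because the text says "in excess of", the comparisons are strict: a sequence at exactly 90% identity falls in the 80% tier, not the 90% tier.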

Any articles or patents referenced herein are
incorporated by reference. The following examples
illustrate various aspects of the invention but are in no
way intended to limit the scope thereof.

Example 1 

Use Of HVR1 Protein Or Nucleic Acid
Sequence Encoding HVR1 Protein As A Vaccine

Mammals are immunized intradermally or
intramuscularly with 2 to 20 µg of at least one HVR1
protein having an amino acid sequence of at least six
contiguous amino acids selected from the amino acid
sequence shown in SEQ ID NOs: 50-98 or with 10 to 1000 µg
of expression vector containing at least one nucleic acid
having a sequence of at least 15 nucleotides selected from
SEQ ID NOs: 1-49 to stimulate production of protective
antibodies. Those of ordinary skill in the art would
readily understand that the HVR1 protein or the expression
vector containing HVR1 nucleic acid sequence can be used
alone or in combination with other HVR1 proteins or other
expression vectors containing different HVR1 nucleic acid
sequences presented herein. When HVR1 proteins or nucleic
acid sequences from multiple isolates are used as
immunogens, the immunized mammals are protected from
challenge with multiple isolates of HCV.
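
The requirement of "at least six contiguous amino acids" (or at least 15 nucleotides) selected from a listed sequence can be tested with a simple window comparison. The sketch below is illustrative only; the function name is an assumption and the sequences are hypothetical placeholders, not entries from the sequence listing.

```python
def shares_contiguous(candidate: str, reference: str, min_len: int) -> bool:
    """True if candidate and reference share a run of at least
    min_len identical contiguous residues."""
    candidate, reference = candidate.upper(), reference.upper()
    # Every window of exactly min_len residues in the reference.
    windows = {
        reference[i:i + min_len]
        for i in range(len(reference) - min_len + 1)
    }
    # A shared run of min_len or more implies a shared window of min_len.
    return any(
        candidate[i:i + min_len] in windows
        for i in range(len(candidate) - min_len + 1)
    )


if __name__ == "__main__":
    reference = "MKTAYIAKQRQISFVK"   # hypothetical placeholder sequence
    print(shares_contiguous("GGAYIAKQGG", reference, 6))  # six shared residues
    print(shares_contiguous("GGAYIAKGG", reference, 6))   # only five shared
```

The same function applies to nucleic acid sequences by passing min_len=15.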

Example 2 

Use Of Antisera To The HVR1 Protein
Sequences In Pre- or Post-Exposure Prophylaxis

Antisera collected from a mammal injected with a
protein having an amino acid sequence of at least six
contiguous amino acids selected from the amino acid
sequences shown in SEQ ID NOs: 50-98, or a mixture of such
proteins, is administered intravenously to an individual
post-exposure to HCV or is administered to an uninfected
mammal in an amount effective to protect against hepatitis
C infection. Such administration is repeated one or more
times at monthly intervals and serves to reduce the
severity of the HCV infection as indicated by, for
example, diminished replication of HCV.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANTS: The Government Of The United 

States Of America As 
Represented By The Secretary 
Department Of Health And Human 
Services 



(ii) TITLE OF INVENTION: NUCLEOTIDE AND DEDUCED 

AMINO ACID SEQUENCES OF HYPERVARIABLE 
REGION 1 OF THE ENVELOPE 2 GENE OF ISOLATES 
OF HEPATITIS C VIRUS AND THE USE OF 
REAGENTS DERIVED FROM THESE HYPERVARIABLE 
SEQUENCES IN DIAGNOSTIC METHODS AND 
VACCINES 



(iii) NUMBER OF SEQUENCES: 98 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: MORGAN & FINNEGAN 

(B) STREET: 345 PARK AVENUE 

(C) CITY: NEW YORK 

(D) STATE: NEW YORK 

(E) COUNTRY: USA 

(F) ZIP: 10154 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: FLOPPY DISK 

(B) COMPUTER: IBM PC COMPATIBLE 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: WORDPERFECT 5.1 



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: To Be Assigned 

(B) FILING DATE: 05-JUNE-1996 

(C) CLASSIFICATION: 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/484,322 

(B) FILING DATE: 07-JUNE-1995 

(C) CLASSIFICATION: 


(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: FEILER, WILLIAM S. 

(B) REGISTRATION NUMBER: 26,728 

(C) REFERENCE/DOCKET NUMBER: 2026-4116PC1 



(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (212) 758-4800 

(B) TELEFAX: (212) 751-6849 

(C) TELEX: 421792 






(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homo sapiens 
(C) INDIVIDUAL ISOLATE: S18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GAC ACC TAC GCC ACT GGG GGG AGT GCC AGC AGG ACC ACG 
CAG GCG TTC ACT AGG TTC TTC TCT CCG GGC GCC AAG CAG 
GAC ATC CAG CTA ATC AAC 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S14 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

GAC ACC TAC ATC ACC GGG GGA ACT GCC GGT CGC ACC GTG 
GGG ACA CTC AGC AAT CTC CTC GCA CCG GGC GCC AAG CAG 
AAC ATC CAG CTG ATT AAC 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

AGC ACC CAC GTC ACC GGG GGA ACT GCC GCC CGC GCT GCG 39 
TTT GGC ATT ACT AGT CTC TTT GCA CCA GGC GCC AAA CAG 78 
AAC ATC CAA CTG ATC AGC 96 
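
The relationship between the nucleotide entries (SEQ ID NOs: 1-49) and the deduced amino acid entries (SEQ ID NOs: 50-98) can be checked with a short sketch. The Python below is illustrative only and assumes the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical. It translates SEQ ID NO: 3 (isolate DK7) codon by codon, and the result can be compared with the DK7 amino acid entry, SEQ ID NO: 52.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# One-letter to three-letter residue names ("Ter" marks a stop codon).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]]
            for i in range(0, len(seq), 3)]


# SEQ ID NO: 3 (isolate DK7), copied from the entry above.
SEQ_ID_3 = """
AGC ACC CAC GTC ACC GGG GGA ACT GCC GCC CGC GCT GCG
TTT GGC ATT ACT AGT CTC TTT GCA CCA GGC GCC AAA CAG
AAC ATC CAA CTG ATC AGC
"""

residues = translate(SEQ_ID_3)
print(len(residues), "residues")
print(" ".join(residues))
```

A 96 base pair entry yields 32 residues and a 108 base pair entry yields 36, which matches the lengths stated for the corresponding amino acid entries.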






(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US11 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GAA ACC TAC GTC ACC GGG GGA AGT GCC GGC CAT GCC GCG 39 
TCT GGA CTT GCT GGT CTT TTC TCA CAA GGC GCC CAG CAG 78 
AAC ATC CAG CTG ATC AAC 96 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GAA ACC TAC ACC ACC GGG GGG GCT GCT GGT CAG ACC GCG 39 
TCT GGA TTC ACC AGT CTT TTC ACG CGG GGC GCC CAG CAG 78 
AAT ATC CAG CTG GTC AAC 96 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK9 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GAC ACC CGC GTC ACC GGG GGG AGC GCT GCC AGG AAC ACG 39 

TAT GGA CTC GCC AGT CTT CTC AGC CCG GGC GCC AAG CAG 78 
AAT ATT CAG CTG ATC AAC 96 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GGC ACC CAA GTC AGC GGG GGG AGC GCC GCT CGC ACC GTG 39 
AAT GCA CTC GCT GGT CTC TTC GAC CAG GGC GCG CGG CAG 78 
AAT ATC CAG TTG ATC AAC 96 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

ACC ACC CAT GTC ACT GGG GGA AGT GAA GCT CGC GCC GCG 39 
TCT GCA CTC ACT GGT CTC TTC ACG CGG GGC GCG CGG CAG 78 
AAC GTC CAG TTG ATC AAC 96 

(2) INFORMATION FOR SEQ ID NO: 9: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

CGT GGA GGC GTG GGC ACC CAC ACG ATA GGG GGG GCG CAA 39 
GCC TAC AGC GTT AGG GGG TTC ACG TCC ATA TTT TCA ACT 78 
GGG CCG GCT CAG AAG ATC CAG CTT GTA AAC 108 



(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

AGT GCA TCC CCG GGC ACC CGC ACG ATA GGG GGG TCG CAA 39 
GCC AAA CAC ACT AGC AGT ATC GTG TCC ATG TTC TCA CTT 78 
GGG CCG TCT CAG AAA ATC CAG CTT GTA AAC 108 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: P10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

CGC ACC CAC ACQ ACG GGG GGG TCG GTG GCC TAC GGC ACC 39 
CGC AGG TTT ACG TCC CTC TTT ACA TCT GGG GCG TCT CAG 78 
AAA ATC CAG CTT GTG AAC 96 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

AGC ACC CGC GTA ACA GGG GGA ACG GCA GCC CGC AAC ACC 39 
TAC GGG CTC GCG TCC ATC TTT GCA CCT GGG GCG TCT CAG 78 
AAG ATC CAG CTT ATA AAC 96 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: HK5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GCC ACC CAC GTG ACA GGG GGT ACT GCA GCC CAC ACC ACT 39 
CGT GGG CTC ACG TCC CTG TTC GCC CCT GGG CCT TCT CAG 78 
AAA ATC CAG CTT ATA AAT 96 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

GAT ACC TAC GTG TCA GGG GGT GCG ACA GCC CGC AAC ACT 
TAC GGG CTT ACG TCC CTC TTC ACC CCA GGG GCT GCT CAG 
AAA ATC CAG CTT ATA AAC 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



ACA ACC CAC GTG TCA GGG GGG GTG TCG GCT CGC ACC ACC 
CAC GGG CTG GCA TCC TTC TTT TCA CCT GGG CCG TCT CAG 
AAA ATC CAG CTC GTA AAC 



(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



AAC ACC TAC ACG ACA GGG GGA GAG GCA GCC TAC AAT ACC 
CGC GGC TTT GCG AGT ATC TTC TCA AGC GGG CCG TCT CAG 
AAA ATC CAG CTC GTA AAC 

(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA10 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GGG ACC TAC ACG ACA GGG GGG GCG CAA GGC CGC ACC ACC 

TCC AGC TTC GTG GGT CTC TTC ACC CCT GGG CCG TCT CAG 

AGA ATC CAG CTC GTA AAC 



(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



GAG ACT CAC GTG ACG GGG GGG GCG CAA GCC TAC GCC GCC 
CGC AGT TTC ACG TCT CTC TTC ACA CCT GGG TCA CGT CAG 
AAT ATC CAG CTT ATA AAC 




(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

CAG GCC AAG ACA ATA GGG GGG CGC CAA GCC CAC ACC ACC 39 
GGG CGC CTT GTG TCT ATG TTC ACC CCT GGG CCG TCC CAG 78 
AAC ATC CAG CTT GTA AAC 96 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CAC ACC AAC ATA ATA GGG GGG AGG GAA GCC TCC ACC ACC 39 
CAA GGC TTT ACG AGT CTT TTC AGC CCT GGA GCG TCC CAG 78 
AAA ATC CAG CTT GTA AAC 96 






(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

AGC ACC CAC ACG ATA GGG GCA ACT GTG GCC CGC ACC ACT 39 
CAG AGT TGG ACG GGC TTC TTC AGC TCC GGG CCC TCT CAG 78 
AAA ATC CAG CTT ATA AAT 96 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

GGC ACC ACC GTG ACG GGA GCG GTG CAA GGC CGT TCC CTC 39 
CAA GGG CTC ACT GGC CTT TTT TCC TCT GGA CCG ACT CAG 78 
AAA CTC CAG CTT GTA AAT 96 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: HK4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

AAC ACC TAC GTG ACA GGG GGG GCG GCA AGC CAT TCC ACC 39 
CGA GGG CTC ACG TCC CTT TTC ACA ACG GGG GCG TCT CAG 78 
AAA ATC CAG CTT ATA AAC 96 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: S45 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

GGT ACC TAC ACG TCG GGG CAG GCG GCG GGC CGC ACC ACC 39 
GCC GGG TTT ACG TCC ATC TTT AAC CCT GGG TCG GCT CAG 78 
AGC ATC CAG CTC ATA AAC 96 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

ACC ACC CAC GTG ACG GGG GCG GTG CAG GGC CGC ACC ACC 
CAA GGT TTC GCG TCC CTC TTC TCA CCC GGA TCG GCC CAG 
AAA ATC CAG CTT GTA AAC 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

GCA ACC AGG ACG GTT GGG CAT TCT GCA GCG TAG ACC GCC 
TCC ACT TTC GCC GGC ATC TTC AAC GCT GGC TCT AGG CAG 
AAC ATC CAG CTC ATC AAC 


(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 



AGC TCC ACC ACC ATT GGG AGT OCT GTC GCG AGC ACC ACC 
AGA GGC CTC ACC GGC TTG TTC TCC CCA GGC TCT CAG CAG 
AAC ATC CAG CTC ATT AAC 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

ACC ACC CAT ACA TCT GGG GGC ACC GCC GGG CAT ACA GCC 
TAT GGC CTC ACC AGC ATC TTC AGC CCT GGC GCC CGG CAG 
AAA ATC CAG CTC ATT TAT 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

CAC ACC GAG CTC ACC GGG AGT AAT GCC GGG CGT ACC ACC 
CAG GGC CTC GCT GCC TTC TTC ACC CCT GGC GCT AGC CAG 
AGG GTT CAG CTC ATT AAC 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

ACC ACC TAT ACT ACC GGC GCA CAA GTG GCT CGT ACC ACT 39 
GCT AGT CTT GCC GGC CTC TTC ACC ACC GGT CCT CAG CAG 78 
AAA ATC AAC TTA ATC AAT 96 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: DK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

GCC ACT TAT ACC ACC GGC GGA CAA GCG GCT AGG GAC ACC 39 
TGG GGG CTT GCT CGC CTC TTC TCC CCT GGC GCC CAG CAG 78 
AAA CTC AGT TTG ATC AAC 96 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: DK11 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

AAC ACC CGT GTC ACC GGC GCG ATC GCG GGT CGG ACC GCC 39 
GCA TCG CTT GCT AGC CTC TTT AAC TCT GGC CCC CAG CAG 78 
AAA ATC AAT TTG ATC AAC 96 






WO 96/40764 PCT/US96/09340 




(2) INFORMATION FOR SEQ ID NO: 33: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S83 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

ACC ACT TAT ACC ACT GGA GCA TCT GCT GGA CAG CAG GTA 39 

CAG AGC TTC GCC AGA CTC TTC AGT CCG GGG CCC AAC CAG 78 

CAT GTC CAG CTC GTC CGC 96 



(2) INFORMATION FOR SEQ ID NO: 34: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

GGG ACA TAT ATC AGT GGT GGC CAC GTG GCT CGT GGT GCC 39 
TCG GGG CTC GCC AGC TTT TTT TCT CCG GGC GCC AAA CAG 78 
AAC CTG CAG CTG ATC AAT 96 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GAA ACA TAT GTC ACC GGT GGC ACT GCA OCT CGT AGT GCT 39 
AGT AGG CTA GCT AGC TTC TTT TCT CCG GGC GCC CAG CAG 78 
AAA CTG CAG CTG GTT AAC 96 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S52 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

GAA ACA TAT GTC ACC GGT GGC AGT GTA GCT CAT AGT GCT 39 
AGA GGG TTA ACT AGC CTT TTT AGT ATG GGC GCC AAG CAG 78 
AAA CTG CAG TTG GTC AAC 96 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S54 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GCA ACA TAT ACC ACC GGT GGC AGT GCA GCT CAT AGT GCC 39 
CAA GGG ATA ACT CGC CTT TTT AGT GTG GGC GCC AAA CAG 78 
AAC CTG CAG TTG GTC AAC 96 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK12 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

ACC ACA CAC GTC ACC GGT GGC GAT GCA GCT CGT AGT ACC 39 
CTC AGG TTT ACT AGC CTT TTT AGT GTG GGC TCC AAC CAG 78 
CAA CTG CAG CTA GTC AAC 96 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: Z4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CAC ACA TCT GTC AGC GGG GGC ACT CAG GCC CGA GCA GCC 39 
CAA GGG TTG ACC AGC CTC TTT ACA TCT GGG CCC AGA CAA 78 
AAC CTC CAG CTG ATA AAT 96 






(2) INFORMATION FOR SEQ ID NO: 40: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: Z1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

ACC ACG TAC GCT TCT GGC GCT GCG GCC GGC CGA ACC ACC 39 
TCT GGC TTT GCC GGC CTA TTT ACC CCT GGC GCC AAG CAG 78 
x^AC ATC CGG CTT ATC AAC 96 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

ACG ACC ATG ACA ACC GGG GGA GCT GCT GCC CGC ACT GCC 
CAC GCC TTC ACC GGC CTT TTC ACT TCT GGG CCC CAG CAA 
AAA TTA CAG CTC ATT AAC 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

GAG ACC GTG ACA ACT GGG GGA AGC GTT GCT CGC AGC ACC 
CGG GCC ATT ACT AGC CTC TTC AAT TCT GGG CCT AAG CAG 
AAC CTA CAG CTC ATT AAT 


(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK13 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

GGC ACC TAG GTC ACC GGG GGC GAG GCG GGA GAG ACC GCG 
TTT GAC CTT ACC GGA CTG TTC ACC AGG GGT TCC CAC CAG 
AAC ATA CAG CTC ATT AAC 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

AGC ACC CAC AGT GTG GGG GGC TCT GCA GCT CAT ACT ACG 
AGC GGC TTT ACC TCA CTT TTC AAC CCC GGG CCG AAG CAG 
AAC TTG CAG CTC ATA TAC 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

CGC ACC CAC ACC GTG GCC GGT ACC GCT GCT TAC AGT ACG 
CGA GGC TTT GCC TCG ATT TTC ACC CCC GGG CCA AAG CAG 
AAC TTG CAG CTC ATA AAT 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA13 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

AAC ACC CGC ACT GTG GGT GGT AGT GCG GCC CAA GGC GCG 39 
CGC GGG CTC GCT TCA CTT TTC ACC CCT GGG CCG CAG CAG 78 
AAC TTG CAG CTC ATA AAT 96 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: SA4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

AAC ACC CAC ATT TCG GGC GGT ACT GCT GCT AAA ACT GTG 39 
CAA GGT TTT ACT TCA CTT TTC TCC TTC GGG GCA CAG CAG 78 
AAT TTG CAG CTC ATA AAT 96 

(2) INFORMATION FOR SEQ ID NO: 48: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: SA7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

AAC ACT CAC GTT GTG GGC GGT GCC GCT GCT CGT AGT GCG 39 
AGT GGC ATG GCC TCA CTC TTT ACT GTC GGG GCA AAG CAG 78 
AAT TTG CAG CTC ATA AAT 96 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 93 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homo sapiens 

(C) INDIVIDUAL ISOLATE: HK2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

ACC ACC ACC ACC GGC CAC GCA GTG GGC CGC ACA ACC TCC 39 
AGC CTT GCC GGG CTT TTC TCC CCC GGT GCC AAG CAA AAT 78 
CTA CAA CTT ATC AAC 93 



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Asp Thr Tyr Ala Thr Gly Gly Ser Ala Ser Arg Thr 
1               5                   10 
Thr Gln Ala Phe Thr Arg Phe Phe Ser Pro Gly Ala 
        15                  20 
Lys Gln Asp Ile Gln Leu Ile Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: unknown 
(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: S14 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

Asp Thr Tyr Ile Thr Gly Gly Thr Ala Gly Arg Thr 
1               5                   10 
Val Gly Thr Leu Ser Asn Leu Leu Ala Pro Gly Ala 
        15                  20 
Lys Gln Asn Ile Gln Leu Ile Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Ser Thr His Val Thr Gly Gly Thr Ala Ala Arg Ala 
1               5                   10 
Ala Phe Gly Ile Thr Ser Leu Phe Ala Pro Gly Ala 
        15                  20 
Lys Gln Asn Ile Gln Leu Ile Ser 
25                  30 




(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US11 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Glu Thr Tyr Val Thr Gly Gly Ser Ala Gly His Ala 
1               5                   10 
Ala Ser Gly Leu Ala Gly Leu Phe Ser Gln Gly Ala 
        15                  20 
Gln Gln Asn Ile Gln Leu Ile Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Glu Thr Tyr Thr Thr Gly Gly Ala Ala Gly Gln Thr 
1               5                   10 
Ala Ser Gly Phe Thr Ser Leu Phe Thr Arg Gly Ala 
        15                  20 
Gln Gln Asn Ile Gln Leu Val Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Asp Thr Arg Val Thr Gly Gly Ser Ala Ala Arg Asn 
1               5                   10 
Thr Tyr Gly Leu Ala Ser Leu Leu Ser Pro Gly Ala 
        15                  20 
Lys Gln Asn Ile Gln Leu Ile Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Gly Thr Gln Val Ser Gly Gly Ser Ala Ala Arg Thr 
1               5                   10 
Val Asn Ala Leu Ala Gly Leu Phe Asp Gln Gly Ala 
        15                  20 
Arg Gln Asn Ile Gln Leu Ile Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Thr Thr His Val Thr Gly Gly Ser Glu Ala Arg Ala 
1               5                   10 
Ala Ser Ala Leu Thr Gly Leu Phe Thr Arg Gly Ala 
        15                  20 
Arg Gln Asn Val Gln Leu Ile Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Arg Gly Gly Val Gly Thr His Thr Ile Gly Gly Ala 
1               5                   10 
Gln Ala Tyr Ser Val Arg Gly Phe Thr Ser Ile Phe 
        15                  20 
Ser Thr Gly Pro Ala Gln Lys Ile Gln Leu Val Asn 
25                  30                  35 



(2) INFORMATION FOR SEQ ID NO: 59: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Ser Ala Ser Pro Gly Thr Arg Thr Ile Gly Gly Ser 
1               5                   10 
Gln Ala Lys His Thr Ser Ser Ile Val Ser Met Phe 
        15                  20 
Ser Leu Gly Pro Ser Gln Lys Ile Gln Leu Val Asn 
25                  30                  35 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: P10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Arg Thr His Thr Thr Gly Gly Ser Val Ala Tyr Gly 
1               5                   10 
Thr Arg Arg Phe Thr Ser Leu Phe Thr Ser Gly Ala 
        15                  20 
Ser Gln Lys Ile Gln Leu Val Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Ser Thr Arg Val Thr Gly Gly Thr Ala Ala Arg Asn 
1               5                   10 
Thr Tyr Gly Leu Ala Ser Ile Phe Ala Pro Gly Ala 
        15                  20 
Ser Gln Lys Ile Gln Leu Ile Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Ala Thr His Val Thr Gly Gly Thr Ala Ala His Thr 
1               5                   10 
Thr Arg Gly Leu Thr Ser Leu Phe Ala Pro Gly Pro 
        15                  20 
Ser Gln Lys Ile Gln Leu Ile Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 63: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Asp Thr Tyr Val Ser Gly Gly Ala Thr Ala Arg Asn 
1               5                   10 
Thr Tyr Gly Leu Thr Ser Leu Phe Thr Pro Gly Ala 
        15                  20 
Ala Gln Lys Ile Gln Leu Ile Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: T3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Thr Thr His Val Ser Gly Gly Val Ser Ala Arg Thr 
1               5                   10 
Thr His Gly Leu Ala Ser Phe Phe Ser Pro Gly Pro 
        15                  20 
Ser Gln Lys Ile Gln Leu Val Asn 
25                  30 



(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Asn Thr Tyr Thr Thr Gly Gly Glu Ala Ala Tyr Asn
1 5 10
Thr Arg Gly Phe Ala Ser Ile Phe Ser Ser Gly Pro
15 20
Ser Gln Lys Ile Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Gly Thr Tyr Thr Thr Gly Gly Ala Gln Gly Arg Thr
1 5 10
Thr Ser Ser Phe Val Gly Leu Phe Thr Pro Gly Pro
15 20
Ser Gln Arg Ile Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Glu Thr His Val Thr Gly Gly Ala Gln Ala Tyr Ala
1 5 10
Ala Arg Ser Phe Thr Ser Leu Phe Thr Pro Gly Ser
15 20
Arg Gln Asn Ile Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Gln Ala Lys Thr Ile Gly Gly Arg Gln Ala His Thr
1 5 10
Thr Gly Arg Leu Val Ser Met Phe Thr Pro Gly Pro
15 20
Ser Gln Asn Ile Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

His Thr Asn Ile Ile Gly Gly Arg Glu Ala Ser Thr
1 5 10
Thr Gln Gly Phe Thr Ser Leu Phe Ser Pro Gly Ala
15 20
Ser Gln Lys Ile Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Ser Thr His Thr Ile Gly Ala Thr Val Ala Arg Thr
1 5 10
Thr Gln Ser Trp Thr Gly Phe Phe Ser Ser Gly Pro
15 20
Ser Gln Lys Ile Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Gly Thr Thr Val Thr Gly Ala Val Gln Gly Arg Ser
1 5 10
Leu Gln Gly Leu Thr Gly Leu Phe Ser Ser Gly Pro
15 20
Thr Gln Lys Leu Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Asn Thr Tyr Val Thr Gly Gly Ala Ala Ser His Ser
1 5 10
Thr Arg Gly Leu Thr Ser Leu Phe Thr Thr Gly Ala
15 20
Ser Gln Lys Ile Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S45

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

Gly Thr Tyr Thr Ser Gly Gln Ala Ala Gly Arg Thr
1 5 10
Thr Ala Gly Phe Thr Ser Ile Phe Asn Pro Gly Ser
15 20
Ala Gln Ser Ile Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

Thr Thr His Val Thr Gly Ala Val Gln Gly Arg Thr
1 5 10
Thr Gln Gly Phe Ala Ser Leu Phe Ser Pro Gly Ser
15 20
Ala Gln Lys Ile Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

Ala Thr Arg Thr Val Gly His Ser Ala Ala Tyr Thr
1 5 10
Ala Ser Thr Phe Ala Gly Ile Phe Asn Ala Gly Ser
15 20
Arg Gln Asn Ile Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Ser Ser Thr Thr Ile Gly Ser Ala Val Ala Ser Thr
1 5 10
Thr Arg Gly Leu Thr Gly Leu Phe Ser Pro Gly Ser
15 20
Gln Gln Asn Ile Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Thr Thr His Thr Ser Gly Gly Thr Ala Gly His Thr
1 5 10
Ala Tyr Gly Leu Thr Ser Ile Phe Ser Pro Gly Ala
15 20
Arg Gln Lys Ile Gln Leu Ile Tyr
25 30



(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

His Thr Glu Leu Thr Gly Ser Asn Ala Gly Arg Thr
1 5 10
Thr Gln Gly Leu Ala Ala Phe Phe Thr Pro Gly Ala
15 20
Ser Gln Arg Val Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Thr Thr Tyr Thr Thr Gly Ala Gln Val Ala Arg Thr
1 5 10
Thr Ala Ser Leu Ala Gly Leu Phe Thr Thr Gly Pro
15 20
Gln Gln Lys Ile Asn Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

Ala Thr Tyr Thr Thr Gly Gly Gln Ala Ala Arg Asp
1 5 10
Thr Trp Gly Leu Ala Arg Leu Phe Ser Pro Gly Ala
15 20
Gln Gln Lys Leu Ser Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Asn Thr Arg Val Thr Gly Ala Ile Ala Gly Arg Thr
1 5 10
Ala Ala Ser Leu Ala Ser Leu Phe Asn Ser Gly Pro
15 20
Gln Gln Lys Ile Asn Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S83

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Thr Thr Tyr Thr Thr Gly Ala Ser Ala Gly Gln Gln
1 5 10
Val Gln Ser Phe Ala Arg Leu Phe Ser Pro Gly Pro
15 20
Asn Gln His Val Gln Leu Val Arg
25 30



(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

Gly Thr Tyr Ile Ser Gly Gly His Val Ala Arg Gly
1 5 10
Ala Ser Gly Leu Ala Ser Phe Phe Ser Pro Gly Ala
15 20
Lys Gln Asn Leu Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

Glu Thr Tyr Val Thr Gly Gly Ser Ala Ala Arg Ser
1 5 10
Ala Ser Arg Leu Ala Ser Phe Phe Ser Pro Gly Ala
15 20
Gln Gln Lys Leu Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S52

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Glu Thr Tyr Val Thr Gly Gly Ser Val Ala His Ser
1 5 10
Ala Arg Gly Leu Thr Ser Leu Phe Ser Met Gly Ala
15 20
Lys Gln Lys Leu Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S54

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Ala Thr Tyr Thr Thr Gly Gly Ser Ala Ala His Ser
1 5 10
Ala Gln Gly Ile Thr Arg Leu Phe Ser Val Gly Ala
15 20
Lys Gln Asn Leu Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homo sapiens
(C) INDIVIDUAL ISOLATE: DK12

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Thr Thr His Val Thr Gly Gly Asp Ala Ala Arg Ser
1 5 10
Thr Leu Arg Phe Thr Ser Leu Phe Ser Val Gly Ser
15 20
Asn Gln Gln Leu Gln Leu Val Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

His Thr Ser Val Ser Gly Gly Thr Gln Ala Arg Ala
1 5 10
Ala Gln Gly Leu Thr Ser Leu Phe Thr Ser Gly Pro
15 20
Arg Gln Asn Leu Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Thr Thr Tyr Ala Ser Gly Ala Ala Ala Gly Arg Thr
1 5 10
Thr Ser Gly Phe Ala Gly Leu Phe Thr Pro Gly Ala
15 20
Lys Gln Asn Ile Arg Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Thr Thr Met Thr Thr Gly Gly Ala Ala Ala Arg Thr
1 5 10
Ala His Ala Phe Thr Gly Leu Phe Thr Ser Gly Pro
15 20
Gln Gln Lys Leu Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Glu Thr Val Thr Thr Gly Gly Ser Val Ala Arg Ser
1 5 10
Thr Arg Ala Ile Thr Ser Leu Phe Asn Ser Gly Pro
15 20
Lys Gln Asn Leu Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Gly Thr Tyr Val Thr Gly Gly Gln Ala Gly Gln Thr
1 5 10
Ala Phe His Leu Thr Gly Leu Phe Thr Arg Gly Ser
15 20
His Gln Asn Ile Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

Ser Thr His Ser Val Gly Gly Ser Ala Ala His Thr
1 5 10
Thr Ser Gly Phe Thr Ser Leu Phe Asn Pro Gly Pro
15 20
Lys Gln Asn Leu Gln Leu Ile Tyr
25 30



(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Arg Thr His Thr Val Ala Gly Thr Ala Ala Tyr Ser
1 5 10
Thr Arg Gly Phe Ala Ser Ile Phe Thr Pro Gly Pro
15 20
Lys Gln Asn Leu Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Asn Thr Arg Thr Val Gly Gly Ser Ala Ala Gln Gly
1 5 10
Ala Arg Gly Leu Ala Ser Leu Phe Thr Pro Gly Pro
15 20
Gln Gln Asn Leu Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Asn Thr His Ile Ser Gly Gly Thr Ala Ala Lys Thr
1 5 10
Val Gln Gly Phe Thr Ser Leu Phe Ser Phe Gly Ala
15 20
Gln Gln Asn Leu Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Asn Thr His Val Val Gly Gly Ala Ala Ala Arg Ser
1 5 10
Ala Ser Gly Met Ala Ser Leu Phe Thr Val Gly Ala
15 20
Lys Gln Asn Leu Gln Leu Ile Asn
25 30



(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Thr Thr Thr Thr Gly His Ala Val Gly Arg Thr Thr
1 5 10
Ser Ser Leu Ala Gly Leu Phe Ser Pro Gly Ala Lys
15 20
Gln Asn Leu Gln Leu Ile Asn
25 30
Claims

1. A purified and isolated HVR1 nucleic acid
having a sequence of at least 15 nucleotides selected from
the group consisting of SEQ ID NO: 1 through SEQ ID NO: 49
or a variant thereof.

2. A purified and isolated nucleic acid
sequence coding for a protein having at least six
contiguous amino acids contained in a sequence selected
from the group consisting of SEQ ID NO: 50 through SEQ ID
NO: 98.

3. A purified and isolated protein having at
least six contiguous amino acids contained in a sequence
selected from the group consisting of SEQ ID NO: 50 through
SEQ ID NO: 98.

4. An expression vector comprising a nucleic
acid having a sequence of at least 15 nucleotides selected
from the group consisting of SEQ ID NO: 1 through SEQ ID
NO: 49.

5. A host organism transformed or transfected
with a recombinant expression vector according to claim 4.

6. An HVR1 protein produced by the host
organism of claim 5.

7. A composition comprising at least one
protein of claim 3 and an excipient, diluent or carrier.

8. A composition comprising at least one
expression vector according to claim 4.

9. A method of preventing hepatitis C,
comprising administering the composition of claim 7 to a
mammal in an amount effective to stimulate the production
of protective antibody.

10. A method of preventing hepatitis C,
comprising administering the composition of claim 8 to a
mammal in an amount effective to stimulate the production
of protective antibody.

11. A vaccine for immunizing a mammal against
hepatitis C comprising at least one protein according to
claim 3 in a pharmacologically acceptable carrier.

12. A vaccine for immunizing a mammal against
hepatitis C comprising at least one expression vector
according to claim 4.

13. Anti-HVR1 antibodies having specific
binding affinity for an HVR1 amino acid sequence shown in
SEQ ID NOs: 50-98 or a fragment thereof.

14. A method of preventing hepatitis C
comprising administering the antibodies of claim 13 to a
mammal in an amount effective to protect said mammal from
challenge with HCV.
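The length thresholds recited in the claims (at least 15 nucleotides, at least six contiguous amino acids) can be pictured as a sliding-window comparison. The sketch below is illustrative only and says nothing about how the claims would be construed; the function name `shares_contiguous` is hypothetical, and the reference string is SEQ ID NO: 61 written in one-letter code.

```python
def shares_contiguous(candidate: str, reference: str, k: int) -> bool:
    """Return True if some run of k contiguous symbols of `reference`
    also occurs in `candidate` (case-insensitive exact match)."""
    candidate = candidate.upper()
    reference = reference.upper()
    if k <= 0 or len(candidate) < k or len(reference) < k:
        return False
    # Every window of length k taken from the reference sequence.
    windows = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(
        candidate[i:i + k] in windows
        for i in range(len(candidate) - k + 1)
    )


# SEQ ID NO: 61 (isolate T10) in one-letter code.
SEQ_ID_61 = "STRVTGGTAARNTYGLASIFAPGASQKIQLIN"

if __name__ == "__main__":
    # A peptide carrying residues 5-10 of SEQ ID NO: 61 (TGGTAA).
    print(shares_contiguous("MKTGGTAAHH", SEQ_ID_61, 6))
    # Only five residues in common, so the six-residue window is not met.
    print(shares_contiguous("MKTGGTAHH", SEQ_ID_61, 6))
```

The same function can be called with `k=15` on nucleotide strings taken from SEQ ID NO: 1 through 49.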



FIGURE 1A

Alignment of HVR (nt) of HCV isolates of subtype 1a (I).

SEQ ID NO  Isolate
1          S18       1   GACACCTACgcCACtGGGGGgAgTGCCaGcaGgACCacGcaGgCgtTCActAggtTCtTCt
2          S14       1   GACACCTACaTCACCGGGGGAACTGCCGGtCGCACCGtGggGaCacTCAgcAaTCTCcTCG
3          DK7       1   agCACCcACGTCACCGGGGGAACTGCCGcCCGCGCtGCGTtTGGcaTTACTAGTCTCTTtG
4          US11      1   GAAACCTACGTCACCGGGGGAAgTGCCGGCCAtGCCGCGTCTGGAcTTgCTgGTCTTTTCt
5          SW1       1   GAAACCTACacCACCGGGGGGgcTGCTGGtCAGACCGCGTCTGGAtTCaCCAGTCTTTTCA
6          DK9       1   GACACCCgCGTCACCGGGGGGAGCGCTGCcaGGAaCaCGTATGGACTCGCCAGTCTTcTCA
7          DR4       1   GgCACCCAaGTCAgCGGGGGGAGCGCcGCTCGCACCGtGaATGCACTCGCTGGTCTCTTCg
8          DR1       1   acCACCCAtGTCActGGGGGaAGtGaaGCTCGCgCCGcGtcTGCACTCaCTGGTCTCTTCa
1-8        consensus     gacACC-acgtCAccGGGGG-agtGccg--cgcaccgcGt-tg-acTcactagtcTctTc-

SEQ ID NO  Isolate
1          S18       62  CtCCGGGCGCCAAGCAGgACATCCAGCTaATcAAC
2          S14       62  CACCGGGCGCCAAGCAGAACATCCAGCTGATtAAC
3          DK7       62  CACCAGGCGCCAAaCAGAACATCCAaCTGATCAgC
4          US11      62  CACaAGGCGCCCAGCAGAACATCCAGCTGATCAAC
5          SW1       62  CgCgGGGCGCCCAGCAGAATATCCAGCTGgTCAAC
6          DK9       62  gCCcGGGCGCCaAGCAGAATATtCAGCTGATCAAC
7          DR4       62  aCCaGGGCGCGCGGCAGAATATCCAGTTGATCAAC
8          DR1       62  cgCgGGGCGCGCGGCAGAAcgTCCAGTTGATCAAC
1-8        consensus     caCcgGGCGCc-agCAGaAcaTcCAgcTgaTcAaC
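The consensus rows in these alignments look as though they follow a simple column-wise rule: an upper-case letter where every isolate carries the same base, a lower-case letter where one base is merely the most frequent, and a dash where no single base is the most frequent. That rule is inferred from the rows of Figure 1A and is not stated in the text, so the sketch below should be read as an assumption; the function name `consensus` is illustrative.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Column-wise consensus under an ASSUMED convention:
    upper case = all sequences agree, lower case = strict plurality,
    '-' = tie for the most frequent base."""
    seqs = [s.upper() for s in aligned_seqs]
    if not seqs:
        return ""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    out = []
    for column in zip(*seqs):
        ranked = Counter(column).most_common()
        top_base, top_count = ranked[0]
        if top_count == len(seqs):
            out.append(top_base)            # invariant position
        elif len(ranked) > 1 and ranked[1][1] == top_count:
            out.append("-")                 # no single most frequent base
        else:
            out.append(top_base.lower())    # plurality only
    return "".join(out)


# First seven columns of the eight subtype 1a isolates in Figure 1A.
fig_1a_prefixes = [
    "GACACCT", "GACACCT", "agCACCc", "GAAACCT",
    "GAAACCT", "GACACCC", "GgCACCC", "acCACCC",
]

if __name__ == "__main__":
    print(consensus(fig_1a_prefixes))
```

Under this rule the first seven columns reproduce the start of the printed 1-8 consensus row, which is the only basis for the assumption.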
FIGURE 1B



Alignment of HVR (nt) of HCV isolates of subtype 1b (II).

SEQ ID NO  Isolate
9          D3        1   cGTGgAggCgtGGGCACCCaCACGATAGGGGGGgCGCAAGCCtAcagCgtTAGggGgtTCa
10         D1        1   aGTGcAccCccGGGCACCCgCACGATAGGGGGGTCGCAAGCCaAacaCACTAGCAGtaTCg
11         P10       1               cGCACCCaCACGACgGGGGGGTCGGtgGCCtACggCACCcGCAGGtTta
12         T10       1               aGCACCCgCGTaACAGGGGGaACGGCAGCCCgCAaCACCtaCGGGCTCg
13         HK5       1               GcCACCCACGTGACAGGGGGTACtGCAGCCCaCAcCACTcgtGGGCTCA
14         HK8       1               GatACCtACGTGTCAGGGGGTGCGaCAGCCCGCAaCACTtACGGGCTtA
15         T3        1               AcaACCcACGTGTCAGGGGGgGtGtCgGCtCGCAcCACCCACGGGCTgG
16         SW2       1               AacACCTACACGACAGGGGGaGaGgCAGCCtaCAatACCCgCGGCTTtG
17         SA10      1               GgGACCTACACGACAGGGGGGGCGCAAGgCcgCACCACCtcCAGCTTCG
18         US6       1               GAGACtcACgtGACgGGGGGGGCGCAAGCCtACgCCgCCcGCAGtTTCa
19         IND5      1               CAGgCCAAgAcAATAGGGGGGcGcCAAGCCcACACCACCgGgcGCcTTg
20         IND8      1               CACACCAACAtAATAGGGGGGAGgGAAGCCtcCACCACCCAagGCTTTA
21         HK3       1               aGCACCcACAcGATAGGGGCaActGtgGCCCGCACCACtCAgaGtTggA
22         S9        1               gGCACCacCGTGACgGGaGCGGtGcaAGGCCGTTCCctCCAAGGGCTCA
23         HK4       1               aaCACCTACGTGACaGGGGgGGCGGCAaGCCaTTCCACCCgAGGGCTCA
24         S45       1               ggtACCTACacGtCGGGGcaGGCGGCGGGCCGCACCACCgccGGGTTtA
25         DK1       1               accACCcACgtGaCGGGGgcGGtGcaGGGCCGCACCACCcaaGGtTTcg
9-25       consensus     -gtg-a--c--gggcaCccacatgacaGGgggggcggaagccc-caccacccgcgGgttca

SEQ ID NO  Isolate
9          D3        62  cGTCCATaTTtTCAacTGGGCCGgCTCAGAAgATCCAGCTTGTAAAC
10         D1        62  tGTCCATgTTcTCActTGGGCCGTCTCAGAAAATCCAGCTTGTAAAC
11         P10       50  CGTCCcTCTTTaCAtCTGGGGCGTCTCAGAAAATCCAGCTTGTgAAC
12         T10       50  CGTCCaTCTTTGCACCTGGGGCGTCTCAGAAgATCCAGCTTATAAAC
13         HK5       50  CGTCCCTgTTCGCCCCTGGGcCTTCTCAGAAAATCCAGCTTATAAAt
14         HK8       50  CGTCCCTCTTCaCCCCaGGGgCTgCTCAGAAAATCCAGCTTATAAAC
15         T3        50  CaTCCtTCTTtTCACCtGGGCCGTCTCAGAAAATCCAGCTCGTAAAC
16         SW2       50  CGaGTaTCTTCTCAagcGGGCCGTCTCAGAAAATCCAGCTCGTAAAC
17         SA10      50  tGgGTCTCTTCACcCCTGGGCCGTCTCAGAgAATCCAGCTCGTAAAC
18         US6       50  cGTCTCTCTTCACaCCTGGGtCacgTCAGAAtATCCAGCTTaTAAAC
19         IND5      50  tGTCTaTgTTCACCCCTGGGcCGTCCCAGAAcATCCAGCTTGTAAAC
20         IND8      50  CGaGTcTtTTCAGCCCTGGagCGTCCCAGAAAATCCAGCTTGTAAAC
21         HK3       50  CGGGCtTcTTCAGCTCcGGgCCcTCTCAGAAAATCCAGCTTaTAAAT
22         S9        50  CtGGCCTTTTttCCTCtGGaCCGaCTCAGAAAcTCCAGCTTgTAAAT
23         HK4       50  CGTCCCTTTTcACaaCgGGGgCGtCTCAGAAAATCCAGCTTATAAAC
24         S45       50  CGTCCaTCTTtAacCCtGGGTCGGCTCAGAgcATCCAGCTcATAAAC
25         DK1       50  CGTCCcTCTTctcaCCcGGaTCGGCcCAGAaaATCCAGCTtgTAAAC
9-25       consensus     cgtcccTcTTcacacctGGgcCgtccCAGAaaaTCCAGCTtgTaAAc
FIGURE 1C

Alignment of HVR (nt) of HCV isolates of genotype 1.

SEQ ID NO  Isolate
9          D3        1   cGTGgAggCgtGGGCACCCaCACGATAGGGGGGgCGCAAGCCtAcagCgtTAGggGgtTCa
10         D1        1   aGTGcAtcCccGGGCACCCgCACGATAGGGGGGTCGCAAGCCaAacaCACTAGCAGtaTCg
11         P10       1               cGCACCCaCACGACgGGGGGGTCGGtgGCCtACggCACCcGCAGGtTta
12         T10       1               aGCACCCgCGTaACAGGGGGaACGGCAGCCCgCAaCACCtaCGGGCTCg
13         HK5       1               GcCACCCACGTGACAGGGGGTACtGCAGCCCaCAcCACTcgtGGGCTCA
14         HK8       1               GatACCtACGTGTCAGGGGGTGCGaCAGCCCGCAaCACTtACGGGCTtA
15         T3        1               AcaACCcACGTGTCAGGGGGgGtGtCgGCtCGCAcCACCCACGGGCTgG
16         SW2       1               AacACCTACACGACAGGGGGaGaGgCAGCCtaCAatACCCgCGGCTTtG
17         SA10      1               GgGACCTACACGACAGGGGGGGCGCAAGgCcgCACCACCtcCAGCTTCG
18         US6       1               GAGACtcACgtGACgGGGGGGGCGCAAGCCtACgCCgCCcGCAGtTTCa
19         IND5      1               CAGgCCAAgAcAATAGGGGGGcGcCAAGCCcACACCACCgGgcGCcTTg
20         IND8      1               CACACCAACAtAATAGGGGGGAGgGAAGCCtcCACCACCCAagGCTTTA
21         HK3       1               aGCACCcACAcGATAGGGGCaActGtgGCCCGCACCACtCAgaGtTggA
22         S9        1               gGCACCacCGTGACgGGaGCGGtGcaAGGCCGTTCCctCCAAGGGCTCA
23         HK4       1               aACACCTACGTGACaGGGGGGGCGGCAaGCCATTCCaCCCgAGGGCTCA
5          SW1       1               GAaACCTACacCACCGGGGGGGCtGCtGGTCAgACCGCGtcTGGAtTCA
7          DR4       1               GGCACCCAaGTCAgCGGGGGGAgcGCCGCTCGCACCGtGaaTGcAcTCg
3          DK7       1               aGCACCCACGTCACCGGGGGaAcTGCCGCCCGCgCtGCGttTGgcaTtA
1          S18       1               GaCACCTACGCCACtGGGGGGAgTGCCaGCaGgACCACGcagGcGTTcA
24         S45       1               GgtACCTACaCGtCGGGGcaGGcGGCGGGCCGCACCACCgccGGGTTtA
25         DK1       1               accACCcACGTGACGGGGGcGGtGcaGGGCCGCACCACCcaaGGtTTcG
4          US11      1               GAaACCTACGTCACCGGGGGAAgTGCCGGCCatgCCGCGtccGGACTtG
2          S14       1               GACACCTACaTCACCGGGGGAAcTGCCGGtCGcACCGtGgggacACTCa
6          DK9       1               GACACCCgCGTCACCGGGGGgAGcGCtGCcaGgAaCaCGTaTGgACTCg
8          DR1       1               acCACCCatGTCACtGGGGGaAGtGaaGCtcGcgcCgCGTcTGcACTCa
1-25       consensus     -gtg-a--c--ggacaCccacgtgacaGGgggg-cggcagcccgcaccacccacgggctca

SEQ ID NO  Isolate
9          D3        62  cGTCCATaTTtTCAacTGGGCCGgCTCAGAAgATCCAGCTTGTAAAC
10         D1        62  tGTCCATgTTcTCActTGGGCCGTCTCAGAAAATCCAGCTTGTAAAC
11         P10       50  CGTCCcTCTTTaCAtCTGGGGCGTCTCAGAAAATCCAGCTTGTgAAC
12         T10       50  CGTCCaTCTTTGCACCTGGGGCGTCTCAGAAgATCCAGCTTATAAAC
13         HK5       50  CGTCCCTgTTCGCCCCTGGGcCTTCTCAGAAAATCCAGCTTATAAAt
14         HK8       50  CGTCCCTCTTCaCCCCaGGGgCTgCTCAGAAAATCCAGCTTATAAAC
15         T3        50  CaTCCtTCTTtTCACGtGGGCGGTCTCAGAAAATCCAGCTCGTAAAC
16         SW2       50  CGaGTaTCTTCTCAagcGGGCCGTCTCAGAAAATCCAGCTCGTAAAC
17         SA10      50  tGgGTCTCTTCACcCCTGGGCCGTCTCAGAgAATCCAGCTCGTAAAC
18         US6       50  cGTCTCTCTTCACaCCTGGGtCacgTCAGAAtATCCAGCTTaTAAAC
19         IND5      50  tGTCTaTgTTCACCCGTGGGcCGTCCCAGAAcATCCAGCTTGTAAAC
20         IND8      50  CGaGTcTtTTCAGCCCTGGagCGTCGCAGAAAATCCAGCTTGTAAAC
21         HK3       50  CGGGCtTcTTCAGCTCcGGgCCcTCTCAGAAAATCCAGCTTaTAAAT
22         S9        50  CtGGCCTTTTttCCTCtGGaCCGaCTCAGAAAcTCCAGCTTgTAAAT
23         HK4       50  CgtcCCTTTTCACaaCGGGgGCGtCTCAGAAAATCCAGCTTaTAAAC
5          SW1       50  CcaGTCTTTTCACgCgGGGCGCcCaGCAGAATATCCAGGTGgTCAAC
7          DR4       50  CTgGTCTCTTCGacCaGGGCGCgCgGCAGAATATCCAGtTGATCAAC
3          DK7       50  CTAGTCTCTTtGCaCCaGGCGCCAAaCAGAACATCCAaCTGATCAgC
1          S18       50  CTAGgtTCTTctCtCCgGGCGCCAAgCAGgACATCCAGCTaATGAAC
24         S45       50  CGTCCaTCTTtaacCCtGGgTCGGCtCAGAgCATCCAGCTcATAAAC
25         DK1       50  CGTCCCTCTTCTCACCcGGaTCGGCcCAGAAaATCCAGCTtgTAAAC
4          US11      50  CtggTCTtTTCTCACaaGGCGCCcAGCAGAACATCCAGCTGATcAAC
2          S14       50  gCAaTCTcCTCgCACCGGGCGCCAAGCAGAACATCCAGCTGATtAAC
6          DK9       50  CCAGTCTtCTCAgcCCGGGCGCCAAGCAGAAtATtCAGCTGATCAAC
8          DR1       50  CtgGTCTctTCAcgCgGGGCGCgcgGCAGAAcgTcCAGtTGATCAAC
1-25       consensus     cgt--cTctTcacacctGGggCgtccCAGaaaaTcCAgcTtaTaAac
FIGURE 1D

Alignment of HVR (nt) of HCV isolates of subtype 2a (III).

SEQ ID NO  Isolate
26         US10      1   gcaaCCAggACggTTGGGcaTtCTGcaGCGtaCACCgCCtccacttTCgCCGGCaTcTTCa
27         T4        1   AgCtCCAccACcaTTGGGaGTgCTGtCGCGagCACCaCCagaGGCCTCACCGGCtTgTTCt
28         T9        1   AcCACCcAtACatCTGGGgGcACcGCCGGGCaTACagCCtAtGGCCTCACCaGCaTCTTCA
29         T2        1   caCACCgAgctcaCcGGGaGtAatGCCGGGCgTACcaCCcAgGGCCTCgCtgcCtTCTTCA
26-29      consensus     accaCCaagacca-tGGGagtactGccG-Gc--ACc-CCta-ggccTC-CcggC-TcTTCa

SEQ ID NO  Isolate
26         US10      62  aCgCtGGCTCTagGCAGAACATCCAGCTCATcAAC
27         T4        62  cCCCaGGCTCTCaGCAGAACATCCAGCTCATTAAC
28         T9        62  gCCCTGGCGCcCGGCAGAAaATCCAGCTCATTtAt
29         T2        62  cCCCTGGCGCtaGcCAGAgggTtCAGCTCATTaAc
26-29      consensus     cCcCtGGC-Ct-ggCAGAacaTcCAGCTCATtaAc
FIGURE 1E

Alignment of HVR (nt) of HCV isolates of subtype 2b (IV).

SEQ ID NO  Isolate
30         T8        1   aCCACcTATACtACCGGCGcACAAGtGGCTcGtacCACtgctaGtCTTGCcgGCCTCTTCa
31         DK8       1   gCCACtTATACCACCGGCGgACAAGCGGCTaGGgaCACCtgggGGCTTGCTcGCCTCTTCt
32         DK11      1   aaCACccgTgtCACCGGCGcgatcGCGGgTcGGacCgCCgcatcGCTTGCTaGCCTCTTta
30-32      consensus     acCACctaTaccACCGGCGcacaaGcGGcTcGgacCaCcgc--ggCTTGCc-GCCTCTTca

SEQ ID NO  Isolate
30         T8        62  CCaCcGGtcCtCAGCAGAAAaTCAacTTaATCAAt
31         DK8       62  CCcCTGGCgCCCAGCAGAAAcTCAgTTTGATCAAC
32         DK11      62  aCtCTGGCcCCCAGCAGAAAaTCAaTTTGATCAAC
30-32      consensus     cC-CtGGccCcCAGCAGAAAaTCAatTTgATCAAc
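The amino acid entries in the sequence listing are described as deduced from the nucleotide sequences. For isolate T8, joining the two nucleotide blocks shown in this figure (positions 1-61 and 62-96) and translating them in frame with the standard genetic code should give the peptide listed for the same isolate as SEQ ID NO: 79. The sketch below assumes the standard codon table and a reading frame starting at position 1; the names `translate`, `t8_block_1` and `t8_block_2` are illustrative.

```python
# Standard genetic code, built in the usual T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame DNA (or RNA) string to one-letter amino acids.

    Case is ignored, since the figures use case to mark conservation.
    """
    seq = nucleotides.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Isolate T8 (SEQ ID NO: 30) as printed in the two blocks of this figure.
t8_block_1 = "aCCACcTATACtACCGGCGcACAAGtGGCTcGtacCACtgctaGtCTTGCcgGCCTCTTCa"
t8_block_2 = "CCaCcGGtcCtCAGCAGAAAaTCAacTTaATCAAt"

if __name__ == "__main__":
    print(translate(t8_block_1 + t8_block_2))
```

Converting the result to three-letter code gives a quick cross-check between a figure row and its sequence listing entry, which is useful when either has been damaged in transcription.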
FIGURE 1F

Alignment of HVR (nt) of HCV isolates of genotype 2.

SEQ ID NO  Isolate
30         T8        1   aCCACcTATACtACCGGCGcACAAGtGGCTcGtacCACtgctaGtCTTGCcgGCCTCTTCa
31         DK8       1   gCCACtTATACCACCGGCGgACAAGCGGCTaGGgaCACCtgggGGCTTGCTcGCCTCTTCt
32         DK11      1   AaCACCCgTgtCACCGGCGcgAtCGCGGGTCGGACCGCCgcatcGCTTGCTAGCCTCTTtA
28         T9        1   AcCACCCaTACatCTGGGGGcACCGCCGGGCatACaGCCtatGGCCTCACCAGCaTCTTCA
27         T4        1   AgCtCCAccACcaTTGGGaGTgCTGtCGCGagCACCaCCagaGGCCTCACCGGCtTgTTCt
26         US10      1   gcaACCAgGACggTTGGGcaTtCTGCaGCGtaCACCgCCtccacttTCGCCGGCaTCTTCA
29         T2        1   caCACCgAGctCACcGGGagTaaTGCcGGGCgtACCaCCCAGgGCcTCGCtGcCtTCTTCA
33         S83       1   acCACttAtacCACtGGagcatcTGCtGGaCagcaggtaCAGaGCtTCGCcagacTCTTCA
26-33      consensus     accaCctataccac-GGggg-accGc-G-gcg-acc-cct-gggccTcgCcggccTcTTca

SEQ ID NO  Isolate
30         T8        62  CCaCcGGtcCtCAGCAGAAAaTCAacTTaATCAAt
31         DK8       62  CCcCTGGCgCCCAGCAGAAAcTCAgTTTGATCAAC
32         DK11      62  aCtCTGGCcCCCAGCAGAAAATCAATTTGATCAAC
28         T9        62  gCCCTGGCgCCCgGCAGAAAATCCAGCTCATTtAt
27         T4        62  cCCCaGGCTCTCaGCAGAACATCCAGCTCATTAAC
26         US10      62  aCgCTGGCTCTAGGCAGAACATCCAGCTCATcAAC
29         T2        62  cCCCTGGCgCTAGCCAGAggGTtCAGCTCATtAAC
33         S83       62  gtCCgGGgcCcAaCCAGcatGTcCAGCTCgTccgC
26-33      consensus     cccCtGGc-C-cagCAGaaaaTccagcTcaTcaac
FIGURE 1G

Alignment of HVR (nt) of HCV isolates of subtype 3a (V).

SEQ ID NO  Isolate
34         HK10      1   GggACATATaTCAgtGGTGGCcacGtgGCTCGTgGTGCctcggGGCTcGCcAGCTTtTTTT
35         S2        1   GAAACATATGTCACCGGTGGCAGTGcAGCTCGTAGTGCTAGtaGGCTAGCTAGCTTcTTTT
36         S52       1   GAAACATATGTCACCGGTGGCAGTGtAGCTCATAGTGCTAGAGGGtTAACTAGCCTTTTTA
37         S54       1   GCAACATATacCACCGGTGGCAGTGCAGCTCATAGTGCCCaAGGGaTAACTcGCCTTTTTA
38         DK12      1   aCcACAcAcgtCACCGGTGGCgaTGCAGCTCgTAGTaCCCtcaGGtTtACTaGCCTTTTTA
34-38      consensus     g-aACAtAtgtCAccGGTGGCagtGcaGCTCgTaGTgCc-gagGG-TaaCtaGCcTtTTTa

SEQ ID NO  Isolate
34         HK10      62  CTCCGGGCGCCaAaCAGAAcCTGCAGCTGaTcAAt
35         S2        62  CTCCGGGCGCCcAGCAGAAACTGCAGCTGGTtAAC
36         S52       62  GTaTGGGCGCCAAGCAGAAACTGCAGTTGGTCAAC
37         S54       62  GTGTGGGCGCCAAaCAGAAcCTGCAGTTGGTCAAC
38         DK12      62  GTGTGGGCtCCAAcCAGcAaCTGCAGcTaGTCAAC
34-38      consensus     gT-tGGGCgCCaA-CAGaAaCTGCAGcTggTcAAc
FIGURE 1H

Alignment of HVR (nt) of HCV isolates of subtype 4c.

SEQ ID NO  Isolate
41         Z7        1   acGACCaTGACAACcGGGGGAgctGcTGCcCGCActgCCCacGCCtTcACcgGCCTtTTCA
42         Z6        1   gaGACCgTGACAACtGGGGGAagcGtTGCtCGCAgcaCCCggGCCaTtACtaGCCTcTTCA
41-42      consensus     --GACC-TGACAAC-GGGGGA---G-TGC-CGCA---CCC--GCC-T-AC--GCCT-TTCA

SEQ ID NO  Isolate
41         Z7        62  cTTCTGGGCCccAGCAaAAatTACAGCTCATTAAc
42         Z6        62  aTTCTGGGCCtaAGCAgAAccTACAGCTCATTAAt
41-42      consensus     -TTCTGGGCC--AGCA-AA--TACAGCTCATTAA-
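With only two isolates in the alignment, each dash in the consensus row marks a position where Z7 and Z6 differ, so the figure can be summarised as a pairwise percent identity. The sketch below is a plain position-by-position comparison of two equal-length aligned strings, ignoring case; it is offered as an illustration and is not the method used in the specification to assign genotypes. The names `percent_identity`, `z7_tail` and `z6_tail` are illustrative.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences carry the
    same symbol (case-insensitive). Sequences must have equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Positions 62-96 of isolates Z7 and Z6 as printed in this figure.
z7_tail = "cTTCTGGGCCccAGCAaAAatTACAGCTCATTAAc"
z6_tail = "aTTCTGGGCCtaAGCAgAAccTACAGCTCATTAAt"

if __name__ == "__main__":
    print(round(percent_identity(z7_tail, z6_tail), 1))
```

The number of mismatches found this way should equal the number of dashes in the corresponding consensus row, which gives a simple check on a transcribed alignment.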
FIGURE 1I

Alignment of HVR (nt) of HCV isolates of genotype 4.

SEQ ID NO  Isolate
43         DK13      1   ggCACcTACGtcaCcGGgGgccaGGCgGGaCagACCgCgTtTcaCcTTaCCGGaCTgTTcA
40         Z1        1   acCACgTACGcttCtGGcGctgCGGCcGGCCGAACCaCCTcTGGCTTTgCCGGCCTaTTTA
39         Z4        1   caCACaTctGtcAgCGGGGGcaCTcagGCCCGAgCaGCCCAaGGgTTgACCaGCCTcTTTA
41         Z7        1   acGACCaTGACAACCGGGGGAgCTGcTGCCCGCACtGCCCAcGCCTTcACCgGCCTtTTCA
42         Z6        1   gaGACCgTGACAACtGGGGGAagcGtTGCtCGCAgcaCCCggGCCaTtACtaGCCTcTTCA
39-43      consensus     --cACct--gc-accGGgGg--c-gc-GccCg-accgCccatg-ctTtaCcgGcCTcTTcA

SEQ ID NO  Isolate
43         DK13      62  CCaggGGttCCcAcCAGAACATaCaGCTcATtAAC
40         Z1        62  CCcCTGGcgCCAAgCAGAACATCCgGCTtATcAAC
39         Z4        62  CaTCTGGGCCCAgaCAAAACcTCCAGCTgATaAAt
41         Z7        62  CTTCTGGGCCCcAGCAAAAatTACAGCTCATTAAc
42         Z6        62  aTTCTGGGCCtaAGCAgAAccTACAGCTCATTAAt
39-43      consensus     c-tctGGgcCcaagCAgAAc-TaCaGCTcATtAAc
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FIGURE IJ 

Alignment of HVR (nt) of HCV isolates of subtype 5a.

SEQ ID NO Isolate

44        SA6        1 aGCACCCACAgtGTGGggGGctCtGCaGCTcAtAcTACGaGcGGCTTTaCCTCacTTTTCA
45        SA1        1 cGCACCCACACcGTGGccGGTACcGCtGCTtAcAGTACGCGaGGCTTTGCCTCgaTTTTCA
46        SA13       1 AACACCCgCACTGTGGGtGGTAgTGCgGCccAAgGcGCGCGcGGgcTcGCTTCACTTTTCA
47        SA4        1 AACACCCACATTtcGGGCGGTACTGCTGCTaAAAcTGtGCaaGGttTtaCTTCACTTTTCt
48        SA7        1 AACACtCACgTTgtGGGCGGTgCcGCTGCTcgtAgTGcGagtGGcaTggCcTCACTcTTta
44-48     consensus    aaCACcCaCa-tgtGGgcGGtactGCtGCtca-agtgcGcg-GGctTtgCcTCacTtTTca

SEQ ID NO Isolate

44        SA6       62 aCCCCGGGCCgAAGCAGAACTTGCAGCTCATAtAc
45        SA1       62 CCCCCGGGCCaAAGCAGAACTTGCAGCTCATAAAT
46        SA13      62 CCCCtGGGCCgCAGCAGAACTTGCAGCTCATAAAT
47        SA4       62 CCtTCGGGGCAGAGCAGAATTTGGAGCTCATAAAT
48        SA7       62 CtgTCGGGGCAaAGCAGAATTTGGAGCTCATAAAT
44-48     consensus    cccccGGGcCaaAGCAGAAcTTGCAGCTCATAaAt

FIGURE 1K

Alignment of HVR (nt) of 49 HCV isolates of genotypes 1-6.

SEQ ID NO Isolate

49        HK2        1 ACcaccACCACcGGccacgCaGtgGGcCgcacaacctccAGCcTtG
33        S83        1 aCcACttatACCACTGGagcaTCTGCtGGaCAgcagGtacagAGCTTCG
26        US10       1 gCaACCAgGACggTTGGGcatTCTGCAGCgtACACCGCCtccActTTCG
19        IND5       1 CAggCCAAGACAATAGGGGGGcGccAAGCCcACACCACCgggcGCcTTG
20        IND8       1 CACACCAACAtAATAGGGGGGaGGGAAGCCTcCACCACCCaaGGCTTTa
16        SW2        1 aACACCtACACGACAGGGGGagaGGcAGCCTACAatACCCGCGGCTTTg
11        P10        1 cGCACCcACACGACGGGGGGGtCGGtGGCCTACggCACCCGCaGGTTTA
24        S45        1 GGtACCTACACGtCGGGGcaGGCGGcGGGCCGCACCACCgCCgGGTTTA
17        SA10       1 GGGACCTACACGACaGGGGGGGCGCAAGGCCGCACCACCtCCAGcTTCg
18        US6        1 GaGACtCACGTGACGGGGGGGGCGCAAGcCtaCgCCgCCCgCAGTTTCa
25        DK1        1 ACcACCCACGTGACGGGGGcGGTGCAGGgCCGCACCACCCAaGGTTTCG
15        T3         1 ACaACCCACGTGtCAGGGGGGGTGtCGGCtCGCACCACCCACGGGCTgG
12        T10        1 AgcACCCgCGTaaCAGGGGGaaCGgCAGCCCGCAACACCTACGGGCTcG
14        HK8        1 gAtACCTACGTGtCAGGGGGtGCGaCAGCCCGCAACACtTACGGGCTtA
23        HK4        1 aACACCTACGTGACAGGGGGgGCGGCAagCCAttCCACcCGaGGGCTCA
13        HK5        1 gcCACCcACGTGACAGGGGGTACTGCAGcCCAcACCACtCGtGGGCTCA
29        T2         1 caCACCgAgcTCACcGGGAGTAaTGCCGgGCGtACCACCCagGGCCTCg
27        T4         1 AgCtCCaccACCAtTGGGAGTgCTGtCGcGaGcACCACCagaGGCCTCA
28        T9         1 ACCACCcAtACaTCTGGGGGcaCcGCCGGGCatACagCCTaTGGCCTCA
40        Z1         1 ACCACgTAcgCTTCTGGCGCtgCgGCCGGcCGaACCACCTCTGGCtTTG
30        T8         1 ACCACCTATaCTACCGGCGCacaaGtGGcTCGtACCACtGCTaGtCTTG
32        DK11       1 AaCACCcgTgtCACCGGCGCgatcGCGGgTCGGACCgCCGCatcGCTTG
31        DK8        1 GcCACtTATAcCACCGGCGGaCAaGCGGCTaGGGaCaCCTgGGGGCTTG
34        HK10       1 GggACATATATCAgtGGTGGCCAcGtGGCTCGTGGTGCCTcGGGGCTcG
35        S2         1 GAAACATATGTCACCGGTGGCAGTGcAGCTCGTAGTGCTAGtaGGCTAG
36        S52        1 GAAACATATGTCACCGGTGGCAGTGtAGCTCATAGTGCTAGAGGGtTAA
37        S54        1 GCAACATATacCACCGGTGGCAGTGCAGCTCATAGTGCCCaAGGGaTAA
38        DK12       1 ACCACACACGTCACCGGTGGCgaTGCAGCTCGTAGTaCCCTcaGGtTTA
3         DK7        1 AgCACCCACGTCACCGGGGGAAcTGCCGCCCGcGCTGCGTTTGGcaTTA
4         US11       1 GAAACCTACGTCACCGGGGGAAgTGCCGGCCAtGCCGCGTCTGGAcTTg
5         SW1        1 GAAACCTACacCACCGGGGGGgcTGCTGGtCAGACCGCGTCTGGAtTCa
6         DK9        1 GACACCcgCGtCACCGGGGGGAGcGCTGcCAGGAaCACGTATGGAcTCg
1         S18        1 GACACCTACGcCACtGGGGGGAGTGCCaGCAGGACCACGcAGGCgtTCA
2         S14        1 GACACCTACaTCACcGGGGGAAcTGCCGGTCGCACCGtGggGaCACTCA
8         DR1        1 acCACCCAtGTCACtGGGGGAAGTGaaGCTCGCgCCGcGtcTGCACTCA
7         DR4        1 GGCACCCAaGTCAgCGGGGGgAGcGCcGCTCGCACCGtGaaTGCACTCg
43        DK13       1 GGCACCtACGTCAcCGGGGGCcagGCgGgaCAgACCGCGttTcaCCTTA
44        SA6        1 aGCACCCACAgtGTGGGGGGCtCtGCaGCTCAtACTACGaGcGGCTTTA
45        SA1        1 cGCACCCACACcGTGGccGGTACcGCtGCTtAcAGTACGCGaGGCTTTG
46        SA13       1 aACACCCgCACtGTGGGtGGTAGtGCgGCcCAagGCgCGCGcGGgcTcG
42        Z6         1 gAGACCgTGACAACtGGGGGAAGcGtTGCtCGCAGCaCCCGgGCCaTtA
41        Z7         1 AcGACCaTGACAACcGGGGGAgCTGcTGCCCGCACtgCCCAcGCCTTcA
21        HK3        1 AGCACCcaCACGAtaGGGGCAaCTGtgGCCCGCACCaCtCAaaGtTggA
22        S9         1 gGCACCaCCGTGAcgGGaGCggtgCAaGgCCGttCCctCCAAGGGcTcA
39        Z4         1 cACACatCtGTcAgcGGgGGcaCtCAgGCCCGagCaGCCCAAGGGtTGA
48        SA7        1 AACACtCACGTTgtGGGCGGTgCcGCTGCTCGCAgTGCGagtGGcaTGg
47        SA4        1 AACACCCACATTtcGGGCGGTaCtGCTGCTaAaAcTGTGcaaGGcTTtA
9         D3         1 cGTGgAggCgtGGGCACCCACACGATAGGGGGGgCGCAAGCCtAcAgCGTTAGaGGgTTCA
10        D1         1 aGTGcAtcCccGGGCACCCgCACGATAGGGGGGtCGCAAGCCaAacaCacTAGcaGtaTCg
1-49      consensus    -gtg-a--c--ggacaCccacatcaccGgggggactgcagcccgcaccacccgcgggctca

FIGURE 1K

SEQ ID NO Isolate

49        HK2       47 CCgGgCTtTTCtccCCcGGtgCCAAgCAaaATcTaCAaCTtaTCaaC
33        S83       50 CCaGaCTCTTCAgtCCgGGgcCCAAcCAGcATgTCCAGCTCgTCcgC
26        US10      50 CCgGcATCTTCAaCgCTGGctCtAggCAGAACATCCAGCTCaTCAAC
19        IND5      50 tGtcTATgTTCAcCCCTGGgcCGTCCCAGAACATCCAGCTTGTAAAC
20        IND8      50 CGAGTcTtTTCAgCCCTGGagCGTCCCAGAAAATCCAGCTTGTAAAC
16        SW2       50 CGAGTaTCTTCtCAagcGGGcCGTCTCAGAAAATCCAGCTcGTAAAC
11        P10       50 CGTCCcTCTTTACAtCTGGGgCGTCTCAGAAAATCCAGCTtGTgAAC
24        S45       50 CGTCCaTCTTTAaCCCTGGGtCGgCTCAGAGcATCCAGCTCaTAAAC
17        SA10      50 tGggTCTCTTCACCCCTGGGcCGtCTCAGAGaATCCAGCTCgTAAAC
18        US6       50 CGTCTCTCTTCACACCTGGGTCacgTCAGAAtATCCAGCTTaTAAAC
25        DK1       50 CGTCCCTCTTCTCACCcGGaTCGgCcCAGAAAATCCAGCTTGTAAAC
15        T3        50 CaTCCtTCTTTTCACCTGGGcCGTCTCAGAAAATCCAGCTcGTAAAC
12        T10       50 CGTCCaTCTTTgCACCTGGGGCGTCTCAGAAgATCCAGCTTATAAAC
14        HK8       50 CGTCCCTCTTCACcCCaGGGGCCgCTCAGAAAATCCAGCTTATAAAC
23        HK4       50 CGTCCCTtTTCACaaCgGGGGCgTCTCAGAAAATCCAGCTTATAAAC
13        HK5       50 CGTCCCTgTTCgCCCCTGGGcCTTCTCAGAAAATCCAGCTTATAAAt
29        T2        50 CtGCCTTcTTCaCCCCTGGCgCTagcCAGAgggTtCAGCTCATTAAC
27        T4        50 CCGGCTTgTTCtCCCCaGGCtCTCaGCAGAAcATCCAGCTCATTAAC
28        T9        50 CCaGCaTcTTCAgCCCTGGCGCCCgGCAGAAaATCCAGCTCATTtAt
40        Z1        50 CCGGCCTaTTtACCCCTGGCGCCaAGCAGAAcATCCgGCTtATCAAc
30        T8        50 CCGGCCTCTTcACCaCcGGtCCtCAGCAGAAAATCAAcTTaATCAAt
32        DK11      50 CTaGCCTCTTtAaCtCTGGCCCCCAGCAGAAAATCAATTTGATCAAC
31        DK8       50 CTcGCCTCTTcTCCCCTGGCGCCCAGCAGAAACTCAgTTTGATCAAC
34        HK10      50 CcAGCTTtTTTTCTCCGGGCGCCaAaCAGAAcCTGCAGCTGATCAAt
35        S2        50 CTAGCTTcTTTTCTCCGGGCGCCcAGCAGAAACTGCAGCTGGTtAAC
36        S52       50 CTAGCCTTTTTAGTaTGGGCGCCAAGCAGAAACTGCAGTTGGTCAAC
37        S54       50 CTcGCCTTTTTAGTGTGGGCGCCAAaCAGAAcCTGCAGTTGGTCAAC
38        DK12      50 CTAGCCTTTTTAGTGTGGGCtCCAAcCAGcAaCTGCAGCTaGTCAAC
3         DK7       50 CTAGTCTcTTTgCACcAGGCGCCAAaCAGAACATCCAaCTGATCAgC
4         US11      50 CTgGTCTTTTCtCACaAGGCGCCCAGCAGAACATCCAGCTGATCAAC
5         SW1       50 CCAGTCTTTTCACgCgGGGCGCCCAGCAGAATATCCAGCTGgTCAAC
6         DK9       50 CCAGTCTTcTCAgcCCGGGCGCCAAGCAGAATATtCAGCTGATCAAC
1         S18       50 CtAGgtTCtTCtCtCCGGGCGCCAAGCAGgACATCCAGCTaATCAAC
2         S14       50 gcAaTCTCcTCgCaCCGGGCGCCAAGCAGAACATCCAGCTGATtAAC
8         DR1       50 CTGGTCTCTTCaCgCgGGGCGCGCGGCAGAACgTCCAGTTGATCAAC
7         DR4       50 CTGGTCTCTTCgaCCaGGGCGCGCGGCAGAAtATCCAGTTGATCAAC
43        DK13      50 CCGGACTgTTCAcCagGGGttCcCAcCAGAACATaCAGCTCATtAAC
44        SA6       50 CCTCACTTTTCAaCCCCGGGCCgAAGCAGAACTTGCAGCTCATAtAC
45        SA1       50 CCTCgaTTTTCACCCCCGGGCCaAAGCAGAACTTGCAGCTCATAAAT
46        SA13      50 CTTCaCTTTTCACCCCTGGGCCgcAGCAGAACTTGCAGCTCATAAAT
42        Z6        50 CTaGCCTcTTCAaTTCTGGGCCtaAGCAGAACcTACAGCTCATTAAT
41        Z7        50 CcGGCCTtTTCAcTTCTGGGCCCcAGCAaAAAtTACAGCTCATTAAc
21        HK3       50 CgGGCtTcTTCAgCTCcGGGCCCtCTCAGAAAaTCCAGCTTATAAAT
22        S9        50 CtGGCCTtTTTtCCTCTGGaCCgACTCAGAAACTCCAGCTTgTAAAT
39        Z4        50 CCaGCCTCTTTACaTCTGGGCCcAgaCAaAAcCTCCAGCTgATAAAT
48        SA7       50 CCTCACTCTTTACtgTCGGGGCAAAGCAGAATTTGCAGCTCATAAAT
47        SA4       50 CtTCACTtTTcTCctTCGGGGCAcAGCAGAATTTGCAGCTCATAAAT
9         D3        62 CGTCCATaTTtTCAacTGGGCCGgCTCAGAAgATCCAGCTTGTAAAC
10        D1        62 tGTCCATgTTcTCActTGGGCCGtCTCAGAAaATCCAGCTTGTAAAC
1-49      consensus    cctgccTctTcacccctGGggCcaagCAgaaaaTccagcTcaTaaac

FIGURE 2A

Alignment of HVR (aa) of HCV isolates of subtype 1a (I).

SEQ ID NO Isolate

56        DR4        1 gTqvsGGSAaRTvnAlaglFdqGArQnIQLIN
50        S18        1 DTYaTGGSAsRTtqAftrfFsPGAKQdIQLIN
51        S14        1 DTYiTGGtAgRTvgtLsnLLaPGAKQNIQLIN
55        DK9        1 DTrVTGGsAARntyGLaSLLsPGAKQNIQLIN
52        DK7        1 sTHVTGGtAARAAfGiTSLFaPGAKQNIQLIs
57        DR1        1 tTHVTGGSeARAASaLTGLFtrGArQNvQLIN
53        US11       1 ETYVTGGSAGhAASGLaGLFsqGAQQNIQLIN
54        SW1        1 ETYtTGGaAGqtASGftsLFtrGAQQNIQLvN
50-57     consensus    dTyvtGGsaartasglt-lfspGAkQniQLin

FIGURE 2B

Alignment of HVR (aa) of HCV isolates of subtype 1b (II).

SEQ ID NO Isolate

71        S9         1 gTtVTGAVQGRslQGltgLFSsGptQKlQLVN
74        DK1        1 TTHVTGAVQGRTTQGfASLFSPGsaQKIQLVN
64        T3         1 TTHVsGGVsARTThGLASfFSPGpSQKIQLVN
61        T10        1 [illegible]
62        HK5        1 aThVTGGTAAHtTRGLTSLFAPGpSQKIQLIN
72        HK4        1 nTYVTGGAAsHsTRGLTSLFTtGASQKIQLIN
63        HK8        1 dTYVSGGAtaRnTyGLTSLFTPGAAQKIQLIN
73        S45        1 GTYTSGqAaGRTTaGFTSiFnPGsAQsIQLIN
66        SA10       1 GTYTtGgAqGRTTsSFvGiFtPGPSQrIQLvN
70        HK3        1 sThTIGatvARTTQSwTGfFSsGPSQKIQLiN
69        IND8       1 hTniIGGreAsTTQGFTSlFSpGaSQKIQLVN
65        SW2        1 HTyTTGGeaAYnTRGFaSiFSSGpSQKIQLVN
60        P10        1 rTHTTGGsvAYgTRrFTSLFTSGaSQKIQLVN
67        US6        1 sTHvTGGaQAYaaRsFTSLFTPGsrQNIQLiN
68        IND5       1 qakTIGGrQAhtTgrlVSMFTPGPSQNIQLVN
59        D1         1 saspGTrTIGGsQAkhTssiVSMFSlGPSQKIQLVN
58        D3         1 rggvGThTIGGaQAysvrgftSiFStGPaQKIQLVN
58-74     consensus    gth-tGgaqarttrgftslFspGpsQkiQLvN

FIGURE 2C

Alignment of HVR (aa) of HCV isolates of genotype 1.

SEQ ID NO Isolate

59        D1         1 saspGTrTIGGsQAkhtssivSmFSlGPsQKIQLVN
58        D3         1 rggvGThTIGGaQAySvrGfTSiFStGPaQKIQLVN
71        S9         1 GTtvtGAvQgRSlQGlTGlFSSGPtQKlQLVN
70        HK3        1 sThTIGAtvARTTQswTGfFSSGPSQKIQLiN
68        IND5       1 qakTIGGrqAhTTgrlvSmFtpGPSQnIQLVN
65        SW2        1 HTyTTGGeaAYnTRgFaSiFsSGPSQKIQLVN
60        P10        1 rThTTGGsvAYgTRrFTSLFtSGASQKIQLVN
69        IND8       1 hTniiGGreAsTTqGFTSLFsPGASQKIQLVN
73        S45        1 gTytsGqaaGRTTaGFTSiFnPGSAQsIQLiN
74        DK1        1 [illegible]
64        T3         1 TTHVSGGVsARTThGLASfFSPGpsQKIQLVN
56        DR4        1 gTqVSGGSaARTvnALAGLFdqGARQNIQLIN
57        DR1        1 tThVTGGSeARAASALtGLFtrGARQNvQLIN
53        US11       1 eTyVTGGSAghAASGLAGLFSqGAqQNIQLIN
55        DK9        1 dTRVTGGSAARNTYGLASLlSPGAkQNIQLIN
61        T10        1 sTRVTGGtAARNTYGLASiFaPGAsQKIQLIN
63        HK8        1 dTYVsGGAtARNTYGLTSLFTPGAaQKIQLIN
72        HK4        1 nTYVTGGAAsHsTRGLTSLFTtGASQKIQLIN
62        HK5        1 aTHVTGGTAAHtTRGLTSLFAPGpSQKIQLIN
52        DK7        1 sTHVTGGTAArAAfGiTSLFAPGakQNIQLIs
67        US6        1 ETHVTGGAqAyAArsFTSLFTPGsrQNIQLIN
54        SW1        1 ETYTTGGAaGqTASgFTSLFTrGaqQNIQLVN
66        SA10       1 gTYTTGGAqGRTTSsFvgLFTPGpsQrIQLVN
50        S18        1 DTYaTGGsAsRTTqaFtrfFsPGAKQdIQLIN
51        S14        1 DTYiTGGtAgRTvgtlsnllaPGAKQnIQLIN
50-74     consensus    gt-vtGg-aarttrgltslfspGasQkiQLin

FIGURE 2D

Alignment of HVR (aa) of HCV isolates of subtype 2a (III).

SEQ ID NO Isolate

75        US10       1 aTrTvGhsAayTAstfaglFnaGsRQnIQLIn
77        T9         1 tThTsGgtAghTAyGLTsIFSPGaRQklQLIy
76        T4         1 sstTiGSavasTTrGLTglFSPGsqQnIQLIN
78        T2         1 hteltGSnagrTTqGLaafFtPGasQrvQLIN
75-78     consensus    -t-t-Gs-a--T--gl-giFspG-rQnlQLIn

FIGURE 2E

Alignment of HVR (aa) of HCV isolates of subtype 2b (IV).

SEQ ID NO Isolate

80        DK8        1 aTYTTGgQaARdTwgLArLFspGaQQKlsLIN
79        T8         1 tTYTTGAQvARTTASLAgLFttGPQQKINLIN
81        DK11       1 nTrvTGAiagRTaASLAsLFnsGPQQKINLIN
79-81     consensus    -TytTGaqaaRttasLA-LF--GpQQKinLIN

FIGURE 2F

Alignment of HVR (aa) of HCV isolates of genotype 2.

SEQ ID NO Isolate

82        S83        1 tTytTGasAGqqvQsfArlFsPGpnQhVQLvr
78        T2         1 hTelTGsnAGRtTQGLAafFtPGAsQrVQLIN
80        DK8        1 aTYTTGgQAARdTwGLArLFsPGAQQKlsLIN
79        T8         1 tTYTTGAQvARTTASLAgLFttGPQQKINLIN
81        DK11       1 nTrvTGAlAGRTAASLASLFnsGPQQKINLIN
77        T9         1 tThTsGgtAGhTAyGLTSiFSPGarQKIQLIy
76        T4         1 sstTiGsavAsTtrGLTGlFSPGSqQNIQLIN
75        US10       1 atrTvGhsaAyTastfaGlFnaGSrQNIQLIN
75-82     consensus    ttyttGa-a-rtt-glaglFspG-qQkiqLin

FIGURE 2G

Alignment of HVR (aa) of HCV isolates of subtype 3a (V).

SEQ ID NO Isolate

83        HK10       1 gTYisGGhvARgASgLASFFSPGAkQnLQLlN
84        S2         1 ETYVTGGSaARSASrLASFFSPGAqQKLQLVN
85        S52        1 ETYVTGGSvAHSArGLTSLFSmGAKQKLQLVN
86        S54        1 aTYtTGGSAAHSAqGiTrLFSVGAKQnLQLVN
87        DK12       1 tThvTGGdAArStlrfTsLFSVGsnQqLQLVN
83-87     consensus    eTyvtGGsaArsasgltslFS-GakQ-LQLvN

FIGURE 2H

Alignment of HVR (aa) of HCV isolates of subtype 4c.

SEQ ID NO Isolate

90        Z7         1 tTmTTGGaaARtahAfTgLFtSGPqQkLQLIN
91        Z6         1 eTvTTGGsvARstrAiTsLFnSGPkQnLQLIN
90-91     consensus    -T-TTGG--AR---A-T-LF-SGP-Q-LQLIN

FIGURE 2I

Alignment of HVR (aa) of HCV isolates of genotype 4.

SEQ ID NO Isolate

89        Z1         1 tTYasGaaAGrTtsgfaGLFTpGakQNIrLIN
92        DK13       1 gTYvTGGqAGqTAfhlTGLFTrGshQNIQLIN
90        Z7         1 tTmTTGGaAARTAhAfTGLFTSGPqQkLQLIN
91        Z6         1 eTvTTGGsvARstrAiTSLFnSGPkQNLQLIN
88        Z4         1 hTsvsGGtqARaaqglTSLFtSGPrQNLQLIN
88-92     consensus    tTy-tGgaaarta---tgLFtsGpkQnlqLIN



WO 96/40764    PCT/US96/09340

FIGURE 2J

Alignment of HVR (aa) of HCV isolates of subtype 5a.

SEQ ID NO Isolate

93        SA6        1 sTHsVgGsAAhtTsGFtSlFnPGPKQNLQLIy
94        SA1        1 rTHTVaGtAAysTRGFASiFTPGPKQNLQLIN
95        SA13       1 NTrTVGGsAAqgARGlASLFTPGPqQNLQLIN
97        SA7        1 NTHvVGGaAArsAsGmASLFTvGAkQNLQLIN
96        SA4        1 NTHisGGtAAktvqGftSLFsfGAqQNLQLIN
93-97     consensus    nThtvgG-AA----GfaSlFtpGpkQNLQLIn

FIGURE 2K

Alignment of HVR (aa) of 49 HCV isolates of genotypes 1-6.

SEQ ID NO Isolate

71        S9         1 gTtVTGavqgRSlqglTgLFSsGptQkLQLVN
87        DK12       1 TThVTGgdAaRStlrFTsLFSvGsNQqLQLVN
82        S83        1 TTyTTGasAGqqvqSFArLFSPGpNQhvQLVr
98        HK2        1 TTTTGhAVGrTTsSLAGLFSPGakQNlQLIN
76        T4         1 SsTTIGsAVAsTTrgLTGLFSPGsqQNIQLIN
70        HK3        1 SThTIGatVARTTQswTGFFSsGpSQklQLIN
78        T2         1 hTelTGsnAgRTTQglaaFFtPGASQrvQLIN
50        S18        1 DTYaTGGsAsRTTQaftrFFsPGAKQdIQLIN
51        S14        1 DTYiTGGtAgRTVgtLsnLlaPGAKQNIQLIN
56        DR4        1 GTqVsGGsAaRTVnaLaGLFdqGArQNIQLIN
92        DK13       1 GTyVTGGqAgqTAfhLTGLFTrGshQNIQLIN
90        Z7         1 tTmTTGGAAarTAhaFTGLFTsGpQQklQLIN
54        SW1        1 ETYTTGGAAGqTASGFTsLFTrGAQQNIQLvN
53        US11       1 ETYVTGGSAGhaASGLAgLFSqGAQQNIQLIN
55        DK9        1 dTRVTGGSAARNTYGLASLlSPGAkQNIQLIN
61        T10        1 sTRVTGGtAARNTYGLASiFaPGAsQKIQLIN
63        HK8        1 dTYVsGGAtARNTYGLTSLFTPGAaQKIQLIN
72        HK4        1 nTYVTGGAAsHsTRGLTSLFTtGASQKIQLIN
62        HK5        1 aTHVTGGTAAHtTRGLTSLFAPGpSQKIQLIN
52        DK7        1 sTHVTGGTAARaAfGiTSLFAPGAKQNIQLIs
97        SA7        1 NTHVVGGaAARsAsGmASLFTvGAKQNLQLIN
95        SA13       1 NTrtVGGsAAqgArGLASLFTpGPqQNLQLIN
88        Z4         1 hTsVsGGtqARAAqGLTSLFTsGPRQNLQLIN
57        DR1        1 tTHVTGGseARAAsaLTgLFTrGaRQNvQLIN
67        US6        1 eTHVTGGaqAYAARsFTSLFTpGsRQNIQLIN
60        P10        1 rTHTTGGSVAYgTRrFTSLFTSGasQklQLvN
91        Z6         1 ETvTTGGSVArSTRaiTSLFnSGpKQnLQLiN
85        S52        1 ETYvTGGSVAHSARGlTSLFSmGAKQkLQLVN
86        S54        1 ATYTTGGSAAHSAqGiTRLFSvGAKQnLQLVN
80        DK8        1 ATYTTGGQAARdTwGLARLFSpGAQQKLsLIN
79        T8         1 tTYTTGAQVARTTASLAgLFttGPQQKINLIN
81        DK11       1 nTrVTGAiAgRTAASLASLFnsGPQQKINLIN
84        S2         1 eTYVTGGsAARsASrLASFFSPGAQQKLQLvN
83        HK10       1 gTYISGGhvARgASGLASFFSPGAkQNLQLIN
96        SA4        1 ninibQjGtaAKTvQGFTSLFSfGAqQNLQLIN
69        IND8       1 hTnIIGGreAsTTQGFTSLFSPGAsQKIQLVN
74        DK1        1 TTHVtGaVqgRTTQGFASLFSPGsaQKIQLVN
64        T3         1 TTHVsGGVsARTThGlASfFSPGPSQKIQLVN
65        SW2        1 nTYTTGGeaAynTrGFASlFSsGPSQKIQLVN
66        SA10       1 gTYTTGGAqGRTTSsFvGLFTPGPSQrIQLVN
89        Z1         1 tTYaSGaAAGRTTSGFaGLFTPGakQnIrLIN
73        S45        1 gTYTSGqAAGRTTaGFTSIFnPGsaQsIQLIN
77        T9         1 tTHTSGGtAGHTayGlTSIFsPGarQklQLIY
93        SA6        1 sTHsVGGsAAHTTsGFTSlFnPGPKQNLQLIY
94        SA1        1 rTHTVaGtAAYsTrGFASIFtPGPKQNLQLIN
75        US10       1 aTrTVGhsAAYTastFAglFnaGsrQNIQLIN
68        IND5       1 qakTTGGrQAhTTgrlVSMFtpGPSQNIQLVN
59        D1         1 saspGTrTIGGsQAkhTssiVSMFSlGPSQKIQLVN
58        D3         1 rggvGThTIGGaQAysvrgftSlFStGPaQKIQLVN
50-98     consensus

5. Claims 1-14 (partially): The problem is the
provision of proteins containing amino acid sequences of
the HVR1 of isolates of HCV of subtype 2c for use as
vaccines, and nucleic acids encoding them. The solution
is the nucleic acid/protein sequences SEQ. IDs 33 and 82.

6. Claims 1-14 (partially): The problem is the [...]
vaccines, and nucleic acids encoding them. The solution [...]

[...] vaccines, and nucleic acids encoding them. The solution
is the nucleic acid/protein sequences SEQ. IDs 39-43 and
88-92.

8. Claims 1-14 (partially): The problem is the
provision of proteins containing amino acid sequences of
the HVR1 of isolates of HCV of subtype 5a for use as
vaccines, and nucleic acids encoding them. The solution
is the nucleic acid/protein sequences SEQ. IDs 44-48 and
93-97.

9. Claims 1-14 (partially): The problem is the
provision of proteins containing amino acid sequences of
the HVR1 of isolates of HCV of subtype 6a for use as
vaccines, and nucleic acids encoding them. The solution
is the nucleic acid/protein sequences SEQ. IDs 49 and 98.